Created attachment 812419 [details] vdsm log file Description of problem: Engine VM is not reported by vdsm although it's up and running. Scenario: 1.hosted engine environment with 2 hosts. 2.engine vm runs on 1 host and the second host is unavailable. 3.block connection to storage from the 1 host,so vm should start on the second host but it failed because the second host is unavailable. After resuming connection on the first host I can see that engine vm is up and running (i can connect to it) but according to vdsm it's not exist. Here is the sanlock status output: ------------------------------------------------------------------ [root@purple-vds3 vdsm]# sanlock status daemon 6c067f1a-1056-4314-b825-7f2bf73d3b41.purple-vds p -1 helper p -1 listener p 28340 HostedEngine p 19763 p -1 status s ec7d5406-2cb9-472c-b7e7-2ad6fedf9e3a:1:/dev/ec7d5406-2cb9-472c-b7e7-2ad6fedf9e3a/ids:0 s ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf:1:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/dom_md/ids:0 s hosted-engine:1:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/ha_agent/hosted-engine.lockspace:0 r ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf:586a0e62-5f38-4e93-848f-57588e128261:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/images/b64e7d62-c3cb-44d1-b280-f4e1318720fe/586a0e62-5f38-4e93-848f-57588e128261.lease:0:20 p 28340 ------------------------------------------------------------------ vdsm version: vdsm-4.13.0-13.gitc08fbb6.el6_4.x86_64 vdsm log attached.
what does vdsClient getVmsList report?
vdsClient list shows nothing. vdsClient getVmsList can only be used with the pool and run on SPM. I have no pool.
Created attachment 812604 [details] vdsm.log.3.xz Attaching some additional logs. At first glance this looks like a race between vmCreate and vmDestroy: 10:33:44,290::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 10:33:47,529::BindingXMLRPC::984::vds::(wrapper) client [10.35.102.57]::call vmDestroy with ('ac687a24-05cd- 11:07:51,461::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:07:54,173::BindingXMLRPC::984::vds::(wrapper) client [10.35.102.57]::call vmDestroy with ('ac687a24-05cd- 11:20:00,162::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:20:11,626::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:20:12,928::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:20:23,653::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:20:24,944::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:20:36,199::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:20:37,551::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:20:48,235::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:20:49,522::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:21:00,222::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:21:01,533::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:21:12,199::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:21:13,480::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:21:24,739::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:21:26,022::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': Thread-65601::ERROR::2013-10-15 11:21:22,551::vm::2132::vm.Vm::(_startUnderlyingVm) vmId=`ac687a24-05cd-43fb-9f9d-c53169fa8cd0`::The vm start process failed Traceback (most recent call last): File "/usr/share/vdsm/vm.py", line 2092, in _startUnderlyingVm self._run() File "/usr/share/vdsm/vm.py", line 2959, in _run self._connection.createXML(domxml, flags), File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper ret = f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in createXML if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self) libvirtError: Requested operation is not valid: domain is already active as 'HostedEngine' 11:21:37,233::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:21:38,538::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:21:49,542::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:21:50,831::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:22:01,652::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:22:02,975::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:22:13,848::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:22:15,168::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:22:25,986::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:22:27,313::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:22:38,357::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:22:39,771::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:22:50,659::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:22:52,039::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:23:02,789::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:23:04,095::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:23:15,140::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:23:16,511::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:23:27,233::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:23:28,556::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:23:39,230::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:23:40,521::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 11:23:51,268::BindingXMLRPC::984::vds::(wrapper) client [10.35.109.10]::call vmDestroy with ('ac687a24-05cd- 11:23:52,646::BindingXMLRPC::984::vds::(wrapper) client [127.0.0.1]::call vmCreate with ({'emulatedMachine': 12:46:07,337::BindingXMLRPC::984::vds::(wrapper) client [10.35.102.57]::call vmDestroy with ('ac687a24-05cd- Beside fixing this issue in vdsm we should ask ourselves why the he-agent is issuing these commands.
After discussion with Greg another bug was opened regarding these messages. https://bugzilla.redhat.com/show_bug.cgi?id=1019468
from virt point of view the VM was destroyed by Thread-69404::DEBUG::2013-10-15 12:46:07,337::BindingXMLRPC::984::vds::(wrapper) client [10.35.102.57]::call vmDestroy … and since then it doesn't exist. If someone somewhere somehow started it again without letting vdsm know then it must have been the ha_agent?
any other logs? from ha_agent perhaps? only vdsm is not really helpful, I'm missing the time references and what was going on
(In reply to Michal Skrivanek from comment #8) > any other logs? from ha_agent perhaps? > only vdsm is not really helpful, I'm missing the time references and what > was going on No :( it was a month ago when the bug was opened. And Federico asked vdsm logs,so I attached them.
Greg & Leonid, let's revisit this issue; See if still reproducible. If so who creates / destroys the VM.
I tried to reproduce it with the latest packages and everything looks good. I hope Leonid can confirm. (As an aside, I wonder if this is related to bug 1022924. It seems to have many similarities.)
Indeed. I was unable to reproduce it.
Please re-open if you're able to reproduce.