Description of problem:
The VM recover flow fails after the service is restarted because there is no connection to the pool yet. Searching the logs, I clearly see that recover is called before the connection to the pool is established:

Thread-35::DEBUG::2012-02-26 15:25:02,384::vm::565::vm.Vm::(_startUnderlyingVm) vmId=`a194d529-2a5a-4ff0-aced-942ec0a41d8f`::_ongoingCreations acquired
Thread-35::INFO::2012-02-26 15:25:02,387::libvirtvm::1229::vm.Vm::(_run) vmId=`a194d529-2a5a-4ff0-aced-942ec0a41d8f`::VM wrapper has started
Thread-35::WARNING::2012-02-26 15:25:02,388::vm::396::vm.Vm::(getConfDevices) vmId=`a194d529-2a5a-4ff0-aced-942ec0a41d8f`::Unknown type found, device: '{'device': u'unix', 'alias': u'channel0', 'type': u'channel', 'address': {u'bus': u'0', u'controller': u'0', u'type': u'virtio-serial', u'port': u'1'}}' found
Thread-35::WARNING::2012-02-26 15:25:02,389::vm::396::vm.Vm::(getConfDevices) vmId=`a194d529-2a5a-4ff0-aced-942ec0a41d8f`::Unknown type found, device: '{'device': u'spicevmc', 'alias': u'channel1', 'type': u'channel', 'address': {u'bus': u'0', u'controller': u'0', u'type': u'virtio-serial', u'port': u'2'}}' found
Thread-35::INFO::2012-02-26 15:25:02,391::vm::574::vm.Vm::(_startUnderlyingVm) vmId=`a194d529-2a5a-4ff0-aced-942ec0a41d8f`::Skipping errors on recovery
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 567, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1272, in _run
    self._devices[devType].append(devClass(self.conf, self.log, **dev))
  File "/usr/share/vdsm/libvirtvm.py", line 989, in __init__
    self.blockDev = utils.isBlockDevice(self.path)
  File "/usr/share/vdsm/utils.py", line 52, in isBlockDevice
    return stat.S_ISBLK(os.stat(path).st_mode)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/711a080f-4702-450c-9f5f-bf54f1e99383/45290416-1dd4-494d-8ac9-29471752bd21/images/4d7d52fa-ff77-4089-a19c-9fb6a2f47e39/78c1e6de-ad7e-4f94-8ab9-3f748f75d8a0'

Steps to reproduce:
1. Run VMs.
2. Restart vdsm.
3. The VM recover flow starts.
4. The VM recover flow fails.

Notes:
- The VMs are reported as running in libvirt.
- The VMs are reported as paused by vdsm.

git commit: b39d55059fc6cbc4fb3de522e45ea2381f68a3a7
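For reference, a minimal sketch of why the Drive construction aborts: the image link under /rhev/data-center only exists once the pool is connected and the volume paths have been prepared, so the stat() inside the block-device check raises ENOENT. Only the one-line check is taken from the traceback above; the function name and the placeholder path are illustrative.

import os
import stat

def is_block_device(path):
    # Same one-line check as vdsm's utils.isBlockDevice in the traceback above.
    return stat.S_ISBLK(os.stat(path).st_mode)

# Hypothetical placeholder path; before the pool is connected it does not exist.
missing = "/rhev/data-center/<spUUID>/<sdUUID>/images/<imgUUID>/<volUUID>"
try:
    is_block_device(missing)
except OSError as err:
    # errno 2 ("No such file or directory"), the same error that aborts recovery.
    print("Drive construction would fail here:", err)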
Created attachment 565897 [details]
vdsm log
1. In HSM.__init__, storageRefresh() cleans the storage repository first and only afterwards tries to reconnect the pools:

    def storageRefresh():
        try:
            self.__cleanStorageRepository()  # --> this cleans the spUUID directory
        ....
        for spUUID in dirList:
            self.log.info("lvroyce find storage pool file")
            poolPath = os.path.join(self.storage_repository, spUUID)
            try:
                if os.path.exists(poolPath):  # --> the path no longer exists, so the pool is never connected
                    self.log.info("lvroyce find storage pool path")
                    self._connectStoragePool(spUUID, None, None, None, None)

2. threading.Thread(target=storageRefresh).start() runs storageRefresh in its own thread, and VM recovery runs in another thread. Even if storageRefresh did connect the pool, the recovery process fails whenever it runs earlier than the pool connection. This can be seen in your log:

Thread-49::INFO::2012-02-26 14:39:13,470::vm::574::vm.Vm::(_startUnderlyingVm) vmId=`34c67b37-9ace-4a39-9040-71cdc0f1e397`::Skipping errors on recovery
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 567, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/libvirtvm.py", line 1272, in _run
    self._devices[devType].append(devClass(self.conf, self.log, **dev))
  File "/usr/share/vdsm/libvirtvm.py", line 989, in __init__
    self.blockDev = utils.isBlockDevice(self.path)
  File "/usr/share/vdsm/utils.py", line 52, in isBlockDevice
    return stat.S_ISBLK(os.stat(path).st_mode)
Thread-11::DEBUG::2012-02-26 14:39:13,534::hsm::353::Storage.HSM::(__cleanStorageRepository) Finished cleaning storage repository at '/rhev/data-center'
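A stripped-down illustration of the race described above (names and timing are illustrative, not vdsm's actual code): the refresh thread connects the pool at some later point, while recovery may already be running.

import threading
import time

pool_connected = False

def storage_refresh():
    global pool_connected
    time.sleep(0.5)        # cleaning the repository, scanning for pools, ...
    pool_connected = True  # stands in for _connectStoragePool()

def recover_existing_vms():
    if not pool_connected:
        # This is the window hit in the logs: recovery starts before the
        # refresh thread has reconnected the pool, so the image paths are missing.
        raise RuntimeError("recovery ran before the pool was connected")

threading.Thread(target=storage_refresh).start()
recover_existing_vms()     # whether this raises depends purely on scheduling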
BZ#797526 Wait pool connection before recovery

In this patch:
* Serialize connectStoragePool and _recoverExistingVms
* Don't check whether the pool directory is present before reconnecting to the pool (_connectStoragePool); in fact the pool directory was removed a few lines earlier by __cleanStorageRepository
* Prepare the volume paths (preparePaths) also during the recovery

Signed-off-by: Federico Simoncelli <fsimonce>
Change-Id: I418ce9641d1cdb7f7202fd8c08e7c027d1f6d325
http://gerrit.ovirt.org/4283
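One way to picture the serialization the patch describes (a sketch only, not the gerrit change itself): recovery blocks on an event that the refresh thread sets once the pool connection has been attempted, so the image paths exist by the time the VM device objects are rebuilt.

import threading

pool_ready = threading.Event()

def storage_refresh():
    # ... clean the repository, then reconnect the pool unconditionally ...
    pool_ready.set()   # signal that connectStoragePool has run

def recover_existing_vms():
    pool_ready.wait()  # recovery waits here until the pool is connected
    # ... prepare the volume paths, then rebuild the VM device objects ...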
Recently I proposed a better fix for the issue:

BZ#797526 Lazy evaluation for Drive.blockDev

Signed-off-by: Federico Simoncelli <fsimonce>
Change-Id: I93bc95650544372cec8883a8b7b568ecf419c129
http://gerrit.ovirt.org/4339
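A rough sketch of what lazy evaluation of Drive.blockDev means (an illustrative class, not the code in the gerrit patch): the block-device check moves out of __init__ into a property, so a path that is still missing during early recovery no longer aborts construction of the Drive object and is simply re-checked on the next access.

import os
import stat

class Drive(object):
    def __init__(self, path):
        self.path = path
        self._block_dev = None   # not evaluated yet

    @property
    def blockDev(self):
        if self._block_dev is None:
            try:
                mode = os.stat(self.path).st_mode
            except OSError:
                return False     # path not prepared yet; re-check on next access
            self._block_dev = stat.S_ISBLK(mode)
        return self._block_dev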
*** Bug 819460 has been marked as a duplicate of this bug. ***
I am changing this bug to ASSIGNED, since I hit this issue again on vdsm-4.9.6-13.el6.x86_64. If this is a mistake, please move it back.

A slice of the vdsm log; I will upload the whole vdsm log soon:

Thread-231::DEBUG::2012-06-05 13:25:14,627::BindingXMLRPC::864::vds::(wrapper) client [10.66.11.208]::call vmCont with ('fdabf547-909a-4242-b957-be6d60bf1ca7',) {}
Thread-231::ERROR::2012-06-05 13:25:14,629::BindingXMLRPC::879::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 869, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 210, in vmCont
    return vm.cont()
  File "/usr/share/vdsm/API.py", line 121, in cont
    return v.cont()
  File "/usr/share/vdsm/vm.py", line 823, in cont
    self._underlyingCont()
  File "/usr/share/vdsm/libvirtvm.py", line 1581, in _underlyingCont
    hooks.before_vm_cont(self._dom.XMLDesc(0), self.conf)
AttributeError: 'NoneType' object has no attribute 'XMLDesc'

# vdsClient -s 0 list table
5fd21790-5d3b-4be3-accd-d05cec96db86  15816  win7-16-6-1      Up      10.66.108.78   10.66.109.239
404eacb0-ac9d-40a8-80ff-ebaecccc97bd   6919  win2k8r2-16-6-2  Up      10.66.110.78   10.66.107.81
39a88c9e-35a8-4fde-b1fa-d5a0e6534319   7240  rhel6u3-16-6-1   Up      10.66.106.15   10.66.109.250
bf8c3501-94e4-40b3-93a9-b7d4a23b7ccc   2331  rhel5u8-16-6-2   Up      10.66.108.253  10.66.110.249
68b2308d-df40-475f-94d2-c396d0857233  13498  win2k8r2-16-6-1  Up      10.66.107.227  10.66.105.3
fdabf547-909a-4242-b957-be6d60bf1ca7  12617  win7-16-6-2      Paused
cb9c3a4c-f1e5-462a-ba4a-cf1973c00485   5957  rhel6u2-16-6-1   Up      10.66.108.255  10.66.107.254
91192d02-5ecf-43ec-b5d3-6b616e6fc513   2835  rhel5u8-16-6-1   Up      10.66.110.252
9856096f-d7fc-4285-b7f8-a924ab04201e   3393  rhel6u2-16-6-2   Up      10.66.110.250  10.66.110.248
10b0f8ec-24ad-45c5-b72e-5f549d202337   5415  rhel6u3-16-6-2   Up      10.66.109.252  10.66.106.244

# vdsClient -s 0 continue fdabf547-909a-4242-b957-be6d60bf1ca7
Unexpected exception
Created attachment 589388 [details]
unexpected exception (Error code:16) on vdsm-13

# rpm -q vdsm libvirt qemu-kvm-rhev
vdsm-4.9.6-13.el6.x86_64
libvirt-0.9.10-21.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.294.el6.x86_64
Moving to rhel-6.3.z as this is a rhev-3.1 blocker, not a rhel-6.3.0 one.
The VM object initialization was stopped by such

(In reply to comment #11)
> I am changing this bug to ASSIGNED, since I hit this issue again on
> vdsm-4.9.6-13.el6.x86_64. If any mistake, please move it back.
>
> A slice of vdsm log, will upload the whole vdsm log soon:
> Thread-231::DEBUG::2012-06-05
> 13:25:14,627::BindingXMLRPC::864::vds::(wrapper) client [10.66.11.208]::call
> vmCont with ('fdabf547-909a-4242-b957-be6d60bf1ca7',) {}
> Thread-231::ERROR::2012-06-05
> 13:25:14,629::BindingXMLRPC::879::vds::(wrapper) unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/BindingXMLRPC.py", line 869, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/BindingXMLRPC.py", line 210, in vmCont
>     return vm.cont()
>   File "/usr/share/vdsm/API.py", line 121, in cont
>     return v.cont()
>   File "/usr/share/vdsm/vm.py", line 823, in cont
>     self._underlyingCont()
>   File "/usr/share/vdsm/libvirtvm.py", line 1581, in _underlyingCont
>     hooks.before_vm_cont(self._dom.XMLDesc(0), self.conf)
> AttributeError: 'NoneType' object has no attribute 'XMLDesc'

The traceback and the logs you attached don't contain anything related to comment #0. In fact the issue here is generated by something different:

Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 275, in _recoverExistingVms
    vmObj.preparePaths(vmObj.getConfDevices()[vm.DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 402, in getConfDevices
    self.normalizeDrivesIndices(devices[DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 540, in normalizeDrivesIndices
    if not self._usedIndices.has_key(drv['iface']):
KeyError: 'iface'
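To make the distinction concrete, a tiny reproduction of this newer failure mode (hypothetical data, not taken from the logs): the drive dictionary handed to normalizeDrivesIndices simply has no 'iface' key, so the dictionary lookup itself raises before any storage access happens.

drives = [{'device': 'disk', 'index': 0}]   # hypothetical drive entry, no 'iface' key
used_indices = {}
for drv in drives:
    if drv['iface'] not in used_indices:    # raises KeyError: 'iface', like the traceback above
        used_indices[drv['iface']] = []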
(In reply to comment #14) > The VM object initialization was stopped by such > (In reply to comment #11) > > I am changing this bug to ASSIGNED, since I hit this issue again on > > vdsm-4.9.6-13.el6.x86_64. If any mistake, please move it back. > > > > A slice of vdsm log, will upload the whole vdsm log soon: > > Thread-231::DEBUG::2012-06-05 > > 13:25:14,627::BindingXMLRPC::864::vds::(wrapper) client [10.66.11.208]::call > > vmCont with ('fdabf547-909a-4242-b957-be6d60bf1ca7',) {} > > Thread-231::ERROR::2012-06-05 > > 13:25:14,629::BindingXMLRPC::879::vds::(wrapper) unexpected error > > Traceback (most recent call last): > > File "/usr/share/vdsm/BindingXMLRPC.py", line 869, in wrapper > > res = f(*args, **kwargs) > > File "/usr/share/vdsm/BindingXMLRPC.py", line 210, in vmCont > > return vm.cont() > > File "/usr/share/vdsm/API.py", line 121, in cont > > return v.cont() > > File "/usr/share/vdsm/vm.py", line 823, in cont > > self._underlyingCont() > > File "/usr/share/vdsm/libvirtvm.py", line 1581, in _underlyingCont > > hooks.before_vm_cont(self._dom.XMLDesc(0), self.conf) > > AttributeError: 'NoneType' object has no attribute 'XMLDesc' > > The traceback and the logs you attached don't contain anything related to > comment #0. In fact the issue here is generated by something different: > > Traceback (most recent call last): > File "/usr/share/vdsm/clientIF.py", line 275, in _recoverExistingVms > vmObj.preparePaths(vmObj.getConfDevices()[vm.DISK_DEVICES]) > File "/usr/share/vdsm/vm.py", line 402, in getConfDevices > self.normalizeDrivesIndices(devices[DISK_DEVICES]) > File "/usr/share/vdsm/vm.py", line 540, in normalizeDrivesIndices > if not self._usedIndices.has_key(drv['iface']): > KeyError: 'iface' So, I need to open another bz?
(In reply to comment #15) > (In reply to comment #14) > > The traceback and the logs you attached don't contain anything related to > > comment #0. In fact the issue here is generated by something different: > > > > Traceback (most recent call last): > > File "/usr/share/vdsm/clientIF.py", line 275, in _recoverExistingVms > > vmObj.preparePaths(vmObj.getConfDevices()[vm.DISK_DEVICES]) > > File "/usr/share/vdsm/vm.py", line 402, in getConfDevices > > self.normalizeDrivesIndices(devices[DISK_DEVICES]) > > File "/usr/share/vdsm/vm.py", line 540, in normalizeDrivesIndices > > if not self._usedIndices.has_key(drv['iface']): > > KeyError: 'iface' > > So, I need to open another bz? Yes please.
(In reply to comment #16)
> (In reply to comment #15)
> > (In reply to comment #14)
> > > The traceback and the logs you attached don't contain anything related to
> > > comment #0. In fact the issue here is generated by something different:
> > >
> > > Traceback (most recent call last):
> > >   File "/usr/share/vdsm/clientIF.py", line 275, in _recoverExistingVms
> > >     vmObj.preparePaths(vmObj.getConfDevices()[vm.DISK_DEVICES])
> > >   File "/usr/share/vdsm/vm.py", line 402, in getConfDevices
> > >     self.normalizeDrivesIndices(devices[DISK_DEVICES])
> > >   File "/usr/share/vdsm/vm.py", line 540, in normalizeDrivesIndices
> > >     if not self._usedIndices.has_key(drv['iface']):
> > > KeyError: 'iface'
> >
> > So, do I need to open another bz?
>
> Yes please.

New bz entry opened at:
https://bugzilla.redhat.com/show_bug.cgi?id=829566
verified on si24.4 vdsm-4.9.6-44.0.el6_3.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-1508.html