Created attachment 1539483 [details]
logs

Description of problem:
Starting the VM fails with libvirtError "Failed to acquire lock: No space left on device".

This looks like the same issue as in this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1599732
That bug is already verified, but the same error appears with the same test case.

Following this comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1599732#c9
I added "refresh capabilities" to the test case steps before starting the VM.

From the engine log (same for all host retries):
2019-02-26 05:18:55,090+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-7) [] VM '516dbb5c-a5f5-423f-b91f-15930d3c1990'(vm_0_TestCase25515_2605155367) moved from 'WaitForLaunch' --> 'Down'
2019-02-26 05:18:55,113+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-7) [] EVENT_ID: VM_DOWN_ERROR(119), VM vm_0_TestCase25515_2605155367 is down with error. Exit message: Failed to acquire lock: No space left on device.
2019-02-26 05:18:55,113+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-7) [] add VM '516dbb5c-a5f5-423f-b91f-15930d3c1990'(vm_0_TestCase25515_2605155367) to rerun treatment
2019-02-26 05:18:55,127+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-7) [] Rerun VM '516dbb5c-a5f5-423f-b91f-15930d3c1990'. Called from VDS 'host_mixed_1'
2019-02-26 05:18:55,146+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-13222) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM vm_0_TestCase25515_2605155367 on Host host_mixed_1.

From the VDSM log (same for all hosts):
2019-02-26 05:18:54,175+0200 ERROR (vm/516dbb5c) [virt.vm] (vmId='516dbb5c-a5f5-423f-b91f-15930d3c1990') The vm start process failed (vm:937)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2855, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: Failed to acquire lock: No space left on device
2019-02-26 05:18:54,175+0200 INFO (vm/516dbb5c) [virt.vm] (vmId='516dbb5c-a5f5-423f-b91f-15930d3c1990') Changed state to Down: Failed to acquire lock: No space left on device (code=1) (vm:1675)

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.1.1-0.1.el7.noarch

How reproducible:
Reproduced only after ~6 executions, in the same environment and with the same errors.

Steps to Reproduce (according to the TestCase):
1. Create an HA VM with a lease residing on the default storage domain
2. Move the storage domain to maintenance
3. Try to start the VM (expected result -> fail)
4. Activate the storage domain on which the lease resides
5. Start the VM

Actual results:
The VM failed to start

Expected results:
The VM should start

Additional info:
Relevant logs attached

Tested using:
ovirt-engine-4.3.2.1-0.1.el7.noarch

After discussing this with Eyal, I changed the test case so that it tries to start the VM several times, waiting ~10 seconds between attempts (~60 seconds or 6 tries in total).
The VM starts successfully.

Moving to VERIFIED
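
For reference, the retry behaviour described above could look roughly like the sketch below. This is not the actual test code: start_vm is a hypothetical callable standing in for whatever the test framework uses to start the VM, and it is assumed to raise an exception when the start fails. The defaults (6 attempts, 10 seconds apart) are taken from the numbers in this comment.

```python
import time


def start_vm_with_retries(start_vm, attempts=6, delay=10.0, sleep=time.sleep):
    """Call start_vm() until it succeeds or the attempts are used up.

    start_vm is a hypothetical callable (an assumption for this sketch)
    that raises an exception when the VM fails to start.
    sleep is injectable so the wait can be skipped in tests.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return start_vm()
        except Exception as err:
            last_error = err
            # Wait only if another attempt is still coming.
            if attempt < attempts:
                sleep(delay)
    # Every attempt failed: surface the last failure to the caller.
    raise last_error
```

With these defaults the helper waits between attempts but not after the last one, so a start that never succeeds fails after 6 tries and 5 waits.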
This bugzilla is included in oVirt 4.3.2 release, published on March 19th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
*** Bug 1715128 has been marked as a duplicate of this bug. ***