Created attachment 1293601
logs from hypervisor and engine and db dump

Description of problem:
oVirt 4.2, 3.6 DC: The master domain was in status Unknown and was the last domain in the DC. To clean up the DC, I created an unattached domain and successfully re-initialized the DC using it. I then destroyed the domain that was Unknown, and right after that the new master domain, which had been active, moved to Unknown.

The failure is in vdsm, on startSpm, with CurrentVersionTooAdvancedError:

2017-07-02 12:02:40,199+0300 DEBUG (tasks/4) [storage.SamplingMethod] Returning last result (misc:403)
2017-07-02 12:02:40,199+0300 ERROR (tasks/4) [storage.TaskManager.Task] (Task='a145513d-bc53-4af8-9984-03bae2854b8e') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 333, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 304, in startSpm
    expVer=expectedDomVersion)
CurrentVersionTooAdvancedError: Current domain `522e419a-ae04-4f52-be68-486795dc7c90` version is too advanced, expected `3` and found `4`: ''
2017-07-02 12:02:40,199+0300 DEBUG (tasks/4) [storage.TaskManager.Task] (Task='a145513d-bc53-4af8-9984-03bae2854b8e') Task._run: a145513d-bc53-4af8-9984-03bae2854b8e () {} failed - stopping task (task:889)

Version-Release number of selected component (if applicable):
vdsm-4.20.1-119.gitd6d2a1d.el7.centos.x86_64
ovirt-engine-4.2.0-0.0.master.20170627181935.git9424f9b.el7.centos.noarch
libvirt-3.2.0-14.el7.x86_64
qemu-kvm-ev-2.6.0-28.el7.10.1.x86_64
selinux-policy-3.13.1-164.el7.noarch
sanlock-3.5.0-1.el7.x86_64

How reproducible:
Encountered it twice in RHV automation after the domain became Unknown during [1] execution.
[1] https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/workitem?id=RHEVM3-5300

Steps to Reproduce:
oVirt 4.2:
1. In a 3.6 DC, have the last domain, which is the master, in Unknown state
2. Create an unattached domain and re-initialize the DC using it
3. Destroy the Unknown domain

Actual results:
The master domain, which was active, changed its state to Unknown. startSpm failed with the exception mentioned above in vdsm.log.

Engine.log:
2017-07-02 12:14:46,404+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler3) [2e10eac2] Start SPM Task failed - result: 'cleanSuccess', message: VDSGenericException: VDSErrorException: Failed in vdscommand to HSMGetTaskStatusVDS, error = Current domain `522e419a-ae04-4f52-be68-486795dc7c90` version is too advanced, expected `3` and found `4`

Expected results:
startSpm should succeed.

Additional info:
logs from hypervisor and engine and db dump
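
For anyone triaging the attached logs, a minimal sketch of pulling the domain UUID and the expected/found versions out of the error text quoted above. This is not vdsm code: the helper name `parse_version_mismatch` and the regular expression are assumptions made for illustration, and they rely only on the message wording shown in vdsm.log and engine.log in this report.

```python
import re

# Matches the message text as it appears in the logs quoted in this report.
# The pattern is an assumption derived from those two log lines only.
_TOO_ADVANCED = re.compile(
    r"Current domain `(?P<domain>[0-9a-f-]{36})` version is too advanced, "
    r"expected `(?P<expected>\d+)` and found `(?P<found>\d+)`"
)


def parse_version_mismatch(line):
    """Hypothetical helper: return (domain_uuid, expected, found) or None."""
    match = _TOO_ADVANCED.search(line)
    if match is None:
        return None
    return (
        match.group("domain"),
        int(match.group("expected")),
        int(match.group("found")),
    )


# Sample line copied from the vdsm.log excerpt in this report.
sample = (
    "CurrentVersionTooAdvancedError: Current domain "
    "`522e419a-ae04-4f52-be68-486795dc7c90` version is too advanced, "
    "expected `3` and found `4`: ''"
)
print(parse_version_mismatch(sample))
```

The same pattern also matches the engine.log line, since the engine relays the vdsm message text unchanged after `error = `.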
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Closing old bugs, feel free to reopen if still needed.