Created attachment 602747 [details] engine.log

* Import a template from an export domain to a data domain.
* Restart VDSMD.
* After VDSM comes back and becomes SPM, the template (and its disk) remain stuck in Locked status.
Created attachment 602748 [details] vdsm.log
As far as I can see, we are stuck in an infinite loop of trying to import the template and failing. I'm seeing an event of "Failed to complete copy of Template" every 10 seconds.
vdsm log does not contain the restart part or anything that happened before (after?) nor any copyImage/moveImage command.

engine log doesn't seem to contain anything relevant either?
no excerpt here to point at issue seen?
(In reply to comment #4)
> vdsm log does not contain the restart part or anything that happened before
> (after?) nor any copyImage/moveImage command.
>
> engine log doesn't seem to contain anything relevant either?
> no excerpt here to point at issue seen?

I collected the logs a few minutes after it happened.
If you don't have enough info in the logs, please have someone contact me to see the hosts themselves.

Even now, a day later, I still have tasks on the VDSM host:

[root@orange-vdsc ~]# vdsClient -s 0 getAllTasksInfo
df6971bb-eb29-4183-b0d3-7116cfd2352e :
         verb = copyImage
         id = df6971bb-eb29-4183-b0d3-7116cfd2352e
f71d3b61-b51e-4bde-bddc-d6e1f7812aa1 :
         verb = deleteImage
         id = f71d3b61-b51e-4bde-bddc-d6e1f7812aa1
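For anyone who wants to check for leftover tasks on a host without reading the output by eye, here is a minimal sketch that parses the getAllTasksInfo text quoted above into a dictionary. The function name `parse_tasks_info` is hypothetical, and the parser only assumes the "task id, colon, then key = value lines" layout seen in this comment; it is not part of vdsClient.

```python
# Sample text copied from the vdsClient output in this comment.
SAMPLE = """\
df6971bb-eb29-4183-b0d3-7116cfd2352e :
         verb = copyImage
         id = df6971bb-eb29-4183-b0d3-7116cfd2352e
f71d3b61-b51e-4bde-bddc-d6e1f7812aa1 :
         verb = deleteImage
         id = f71d3b61-b51e-4bde-bddc-d6e1f7812aa1
"""


def parse_tasks_info(text):
    """Parse 'task_id :' headers followed by 'key = value' lines.

    Hypothetical helper: it assumes only the layout shown in SAMPLE.
    """
    tasks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A header line such as '<task id> :' starts a new task.
            current = line[:-1].strip()
            tasks[current] = {}
        elif "=" in line and current is not None:
            # A property line such as 'verb = copyImage'.
            key, _, value = line.partition("=")
            tasks[current][key.strip()] = value.strip()
    return tasks


if __name__ == "__main__":
    for task_id, info in parse_tasks_info(SAMPLE).items():
        print(task_id, info.get("verb"))
```

Run against the sample, this lists one copyImage task and one deleteImage task, matching what is still present on the host.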
In your reproduction steps you wrote you restarted vdsmd. That means that log must contain "I am vdsm..." etc (the startup flow). As far as I can tell, it does not. Could be that log rotated, but looks like this is the wrong log.
Created attachment 603045 [details] engine.log

Attaching more logs, presumably these are the right ones.
Created attachment 603046 [details] vdsm.log
After looking at the vdsm.log, there are 2 problems here:

1. after spmStart engine is sending stopTask to a task which is in aborting state (i.e. it's cleaning up) causing the task to stop the cleanup and leave garbage behind.
2. after task stops, engine keeps on calling getAllTasksStatuses which clearly states the task is finished but doesn't do anything about it.

Thread-56::INFO::2012-08-07 16:03:00,833::logUtils::37::dispatcher::(wrapper) Run and protect: getAllTasksStatuses(spUUID=None, options=None)
...
Thread-56::INFO::2012-08-07 16:03:00,835::logUtils::39::dispatcher::(wrapper) Run and protect: getAllTasksStatuses, Return response: {'allTasksStatus': {'df6971bb-eb29-4183-b0d3-7116cfd2352e': {'code': 0, 'message': 'Task is initializing', 'taskState': 'finished', 'taskResult': 'cleanFailure', 'taskID': 'df6971bb-eb29-4183-b0d3-7116cfd2352e'}}}

This is an infra issue.
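To illustrate the second problem, here is a minimal sketch of what a poller could do with the response above instead of polling again. It is an assumption-laden illustration only: the helper names `finished_tasks` and `failed_cleanups` are hypothetical, the dictionary is copied from the log excerpt, and this is not the engine's actual code nor the content of any fix.

```python
# Response copied from the getAllTasksStatuses log line in this comment.
RESPONSE = {
    'allTasksStatus': {
        'df6971bb-eb29-4183-b0d3-7116cfd2352e': {
            'code': 0,
            'message': 'Task is initializing',
            'taskState': 'finished',
            'taskResult': 'cleanFailure',
            'taskID': 'df6971bb-eb29-4183-b0d3-7116cfd2352e',
        }
    }
}


def finished_tasks(response):
    """Return {task_id: taskResult} for tasks whose taskState is 'finished'.

    Hypothetical helper; it relies only on the fields visible in the log.
    """
    statuses = response.get('allTasksStatus', {})
    return {
        task_id: status.get('taskResult')
        for task_id, status in statuses.items()
        if status.get('taskState') == 'finished'
    }


def failed_cleanups(response):
    """Return the ids of finished tasks whose result is 'cleanFailure'."""
    return sorted(
        task_id
        for task_id, result in finished_tasks(response).items()
        if result == 'cleanFailure'
    )


if __name__ == "__main__":
    # A finished task needs no further polling; the caller can end the
    # command that owns it (and report a failure for 'cleanFailure').
    for task_id in failed_cleanups(RESPONSE):
        print("finished with cleanFailure:", task_id)
```

The point of the sketch is that `taskState` alone already says the task is over, so repeating getAllTasksStatuses every 10 seconds cannot change the outcome.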
http://gerrit.ovirt.org/#/c/8057/
fixed in commit: f4e4850
Verified on RHEVM 3.1 - SI19

RHEVM: rhevm-3.1.0-18.el6ev.noarch
VDSM: vdsm-4.9.6-36.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.4.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64