Created attachment 611756 [details]
Logs vdsm, rhevm

Description of problem:
While deleting multiple floating disks and restarting the vdsm service, some disks get stuck in "Locked" status.

Version-Release number of selected component (if applicable):
RHEVM 3.1 - SI17
RHEVM: rhevm-3.1.0-15.el6ev.noarch
VDSM: vdsm-4.9.6-32.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.4.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64
SANLOCK: sanlock-2.3-3.el6_3.x86_64

How reproducible:
Occasionally

Steps to Reproduce:
1. Create an iSCSI DC with 2 hosts
2. Create 12 floating disks
3. Select them all and delete
4. During deletion, restart the vdsm service on the SPM host (run: service vdsmd stop && service vdsmd start)

Actual results:
1. Some disks get "Locked" status
2. There is no option to delete them
3. The deleting task still exists in the DB (see attached log)
4. No tasks are running on either host (run: vdsClient -s 0 getAllTasksInfo)

Expected results:
No "Locked" disks
Option to remove (delete) them
Task cleaned from the DB

Additional info:
In the DB "async_tasks" table: the "delete" task is running
In VDSM (vdsClient -s 0 getAllTasksInfo): no tasks are running
In the DB "all_disks" table, the disks have status imagestatus == 2
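The stale-task condition described above (the engine DB still lists a running task while vdsClient shows none on any host) can be sketched as a small check. `tasks_stuck` is a hypothetical helper, and the counts below are illustrative values matching this report, not taken from a live system:

```shell
# Hedged sketch (not an engine or vdsm tool): a task is considered
# "stale" when the engine DB still lists it in async_tasks but vdsm
# reports no running tasks.
tasks_stuck() {
  # $1 = task count from the engine async_tasks table
  # $2 = task count reported by vdsClient -s 0 getAllTasksInfo
  [ "$1" -gt 0 ] && [ "$2" -eq 0 ]
}

# Illustrative values matching the report: one "delete" task in the DB,
# none running on either host.
engine_tasks=1
vdsm_tasks=0

if tasks_stuck "$engine_tasks" "$vdsm_tasks"; then
  echo "stale task: engine DB has a running task but vdsm reports none"
fi
```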
psql -U postgres engine -c 'select disk_id,disk_alias,imagestatus from all_disks where imagestatus = 2;' | less -S

               disk_id                | disk_alias | imagestatus
--------------------------------------+------------+-------------
 efad1726-9da3-4937-a4f2-8e9f2e9ed37b | A-02       | 2
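For reference, the numeric `imagestatus` column maps to the disk states shown in the UI; 2 is the "Locked" status seen here, and 4 is the "Illegal" state mentioned later during verification. The `decode_imagestatus` helper below is a hypothetical sketch of that mapping, assuming the standard oVirt ImageStatus codes:

```shell
# Hypothetical helper (not an engine tool): map the numeric imagestatus
# column from all_disks to the disk state names used in the UI,
# assuming the standard oVirt ImageStatus codes.
decode_imagestatus() {
  case "$1" in
    1) echo "OK" ;;
    2) echo "Locked" ;;
    4) echo "Illegal" ;;
    *) echo "Unknown ($1)" ;;
  esac
}

decode_imagestatus 2   # the state the stuck disk A-02 is in; prints "Locked"
```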
After 15 hours, the “delete” task is still running and has not been released.
(In reply to comment #2)
> After 15 hours, “delete” task still running, and not released

The task reaper runs after 30 or 50 hours, I don't recall which.
For dead tasks in vdsm it will run after 50-60 hours. For the engine DB cleanup it should be about 5 hours.
*** Bug 856135 has been marked as a duplicate of this bug. ***
When verifying this bug, please also run the scenario from BZ856135.
*** Bug 866886 has been marked as a duplicate of this bug. ***
Could not reproduce; this patch seems to solve it: http://gerrit.ovirt.org/#/c/11075/
Verified on SF10. To reproduce, vdsm was restarted during the removal of the 12 disks. The disks became 'Illegal' and I was then able to remove them.
3.2 has been released