Bug 1836318 - RHV engine is reporting a delete disk with wipe as completing successfully when it actually fails from a timeout.
Summary: RHV engine is reporting a delete disk with wipe as completing successfully when it actually fails from a timeout.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.3.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.5.3
Target Release: ---
Assignee: Mark Kemel
QA Contact: Ilia Markelov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-05-15 15:42 UTC by Frank DeLorey
Modified: 2023-09-07 23:09 UTC
CC: 8 users

Fixed In Version: ovirt-engine-4.5.3.1
Doc Type: If docs needed, set a value
Doc Text:
Previously, when wiping the disk content failed during disk removal, the operation was reported as completing successfully, with no indication that the wipe operation had timed out. In this release, when wiping the disk content fails and the disk removal fails, the disk is set to ILLEGAL status and an audit log entry describing the failure is created.
Clone Of:
Environment:
Last Closed: 2022-11-16 12:17:27 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
Engine log from timeframe (11.02 MB, text/plain), 2020-05-15 16:18 UTC, Frank DeLorey
VDSM logs from start of remove disk (18.37 MB, text/plain), 2020-05-15 16:20 UTC, Frank DeLorey
VDSM logs from timeout failure (18.61 MB, text/plain), 2020-05-15 16:21 UTC, Frank DeLorey


Links
Github oVirt ovirt-engine pull 656 (Draft): core: RemoveImage: handle failure in VDS command, last updated 2022-09-18 08:23:12 UTC
Red Hat Product Errata RHSA-2022:8502, last updated 2022-11-16 12:17:37 UTC

Description Frank DeLorey 2020-05-15 15:42:19 UTC
Description of problem:
Customer tried to remove a block disk that had wipe-after-delete set, and the engine reported the removal as completing successfully ("successfully removed") when it actually failed with a timeout.

Version-Release number of selected component (if applicable):

RHV 4.3.7

How reproducible:
Unknown

Steps to Reproduce:
1. Deactivate the disk
2. Issue the remove disk operation
3. The engine reports success (a reproduction sketch using the Python SDK follows below)
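
For reference, a minimal reproduction sketch using the oVirt Python SDK (ovirtsdk4). The engine URL, credentials, storage domain name, and disk size are placeholders, and while the original case used a VM-attached disk, this sketch uses a floating disk for brevity:

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholders: engine URL and credentials are assumptions for illustration.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='secret',
    insecure=True,
)
disks_service = connection.system_service().disks_service()

# Create a block disk with wipe-after-delete enabled.
disk = disks_service.add(
    types.Disk(
        name='wipe-after-delete-test',
        format=types.DiskFormat.RAW,
        provisioned_size=100 * 2**30,
        wipe_after_delete=True,
        storage_domains=[types.StorageDomain(name='mvpl_data')],
    )
)
disk_service = disks_service.disk_service(disk.id)

# Wait until the disk is unlocked, then remove it. In the reported case the
# engine logged the removal as successful even though the wipe later timed out.
while disk_service.get().status != types.DiskStatus.OK:
    time.sleep(5)
disk_service.remove()

connection.close()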

Actual results:

The VDSM task times out and the disk is deleted from the engine DB, but it still exists on storage.

Expected results:

The engine should correctly report that the disk removal failed.

Additional info:

ENGINE LOG:

2020-05-12 12:50:43,899+05 INFO  [org.ovirt.engine.core.bll.storage.disk.RemoveDiskCommand] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] Running command: RemoveDiskCommand internal: false. Entities affected :  ID: 8d5779f9-a5a6-4d1d-a705-000cceef73d0 Type: DiskAction group DELETE_DISK with role type USER
2020-05-12 12:50:43,933+05 INFO  [org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] Running command: RemoveImageCommand internal: true. Entities affected :  ID: 81dbbab0-46a1-4999-acfb-0f593fa05b58 Type: Storage
2020-05-12 12:50:43,994+05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] START, DeleteImageGroupVDSCommand( DeleteImageGroupVDSCommandParameters:{storagePoolId='352b829e-4d72-11ea-aa35-00163e452e33', ignoreFailoverLimit='false', storageDomainId='81dbbab0-46a1-4999-acfb-0f593fa05b58', imageGroupId='8d5779f9-a5a6-4d1d-a705-000cceef73d0', postZeros='true', discard='false', forceDelete='false'}), log id: 11a2d7b3
2020-05-12 12:50:44,257+05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] FINISH, DeleteImageGroupVDSCommand, return: , log id: 11a2d7b3
2020-05-12 12:50:44,262+05 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command '9000a374-3fae-4110-936a-13bbc6297a74'
2020-05-12 12:50:44,262+05 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] CommandMultiAsyncTasks::attachTask: Attaching task 'c2041ad5-b98e-490c-bf8f-52671536189c' to command '9000a374-3fae-4110-936a-13bbc6297a74'.
2020-05-12 12:50:44,271+05 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] Adding task 'c2041ad5-b98e-490c-bf8f-52671536189c' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
2020-05-12 12:50:44,371+05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] EVENT_ID: USER_FINISHED_REMOVE_DISK_ATTACHED_TO_VMS(2,042), Disk VM-Storage_Disk2 associated to the VMs Sip-Proxy was successfully removed from domain mvpl_data (User admin@internal-authz).
2020-05-12 12:50:44,371+05 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] BaseAsyncTask::startPollingTask: Starting to poll task 'c2041ad5-b98e-490c-bf8f-52671536189c'.
2020-05-12 12:50:44,372+05 INFO  [org.ovirt.engine.core.bll.storage.disk.RemoveDiskCommand] (EE-ManagedThreadFactory-engine-Thread-88609) [6b6cebe6-6368-4278-aa3f-40163da8c7e4] Lock freed to object 'EngineLock:{exclusiveLocks='[8d5779f9-a5a6-4d1d-a705-000cceef73d0=DISK]', sharedLocks='[e89c01f5-411a-4a73-b1d6-0392bb7ceec8=VM]'}'


2020-05-14 14:50:53,675+05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] EVENT_ID: TASK_STOPPING_ASYNC_TASK(9,500), Stopping async task RemoveDisk that started at Tue May 12 12:50:43 IST 2020
2020-05-14 14:50:53,675+05 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] Cleaning zombie tasks: Stopping async task 'RemoveDisk' that started at 'Tue May 12 12:50:43 IST 2020' since it reached a timeout of 3000 minutes
2020-05-14 14:50:53,675+05 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] SPMAsyncTask::StopTask: Attempting to stop task 'c2041ad5-b98e-490c-bf8f-52671536189c' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters').
2020-05-14 14:50:53,682+05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMStopTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] START, SPMStopTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='352b829e-4d72-11ea-aa35-00163e452e33', ignoreFailoverLimit='false', taskId='c2041ad5-b98e-490c-bf8f-52671536189c'}), log id: 98b806b
2020-05-14 14:50:53,684+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMStopTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] START, HSMStopTaskVDSCommand(HostName = mpokket-host1, HSMTaskGuidBaseVDSCommandParameters:{hostId='d43be6b5-9d04-4967-bacd-606419f37198', taskId='c2041ad5-b98e-490c-bf8f-52671536189c'}), log id: 23f3f2c6
2020-05-14 14:50:53,687+05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMStopTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] FINISH, HSMStopTaskVDSCommand, return: , log id: 23f3f2c6
2020-05-14 14:50:53,687+05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMStopTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-82) [] FINISH, SPMStopTaskVDSCommand, return: , log id: 98b806b
2020-05-14 14:51:03,705+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM mpokket-host1 command HSMGetAllTasksStatusesVDS failed: c2041ad5-b98e-490c-bf8f-52671536189c
2020-05-14 14:51:03,712+05 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] BaseAsyncTask::logEndTaskFailure: Task 'c2041ad5-b98e-490c-bf8f-52671536189c' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'success'
-- Message: '',
-- Exception: '[null]'
2020-05-14 14:51:03,724+05 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '9000a374-3fae-4110-936a-13bbc6297a74' has ended -> executing 'endAction'
2020-05-14 14:51:03,724+05 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '9000a374-3fae-4110-936a-13bbc6297a74'): calling endAction '.
2020-05-14 14:51:03,725+05 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-136752) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 'RemoveDisk',
2020-05-14 14:51:03,727+05 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] BaseAsyncTask::onTaskEndSuccess: Task 'c2041ad5-b98e-490c-bf8f-52671536189c' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2020-05-14 14:51:03,727+05 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-10) [] Task with DB Task ID '72bcc259-78ee-4b46-96e0-01dd6a0bca4a' and VDSM Task ID 'c2041ad5-b98e-490c-bf8f-52671536189c' is in state AttemptingEndAction. End action for command 9000a374-3fae-4110-936a-13bbc6297a74 will proceed when all the entity's tasks are completed.

>>> LVS:

07df6929-7c6b-427d-90fb-2cb6315b0d62 81dbbab0-46a1-4999-acfb-0f593fa05b58  204 -wi-a-----  <5.63t  -1  -1  253   28                                                     RDWBzk-HOgi-91Qh-Asdd-qrLN-6bSb-pZf01P          IU__remove_me_ZERO_8d5779f9-a5a6-4d1d-a705-000cceef73d0,MD_11,PU_00000000-0000-0000-0000-000000000000

/rhev/data-center/mnt/blockSD/81dbbab0-46a1-4999-acfb-0f593fa05b58/images/8d5779f9-a5a6-4d1d-a705-000cceef73d0:
total 4
lrwxrwxrwx. 1 vdsm kvm 78 Apr 14 14:38 07df6929-7c6b-427d-90fb-2cb6315b0d62 -> /dev/81dbbab0-46a1-4999-acfb-0f593fa05b58/07df6929-7c6b-427d-90fb-2cb6315b0d62
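
For context, the leftover volume is recognizable by its remove_me_ZERO tag in the lvs output above. A rough detection sketch; only the tag prefix and the lvs fields come from the listing above, and running this on the SPM host plus everything else is an assumption:

import subprocess

# List VG name, LV name and tags with a predictable separator.
out = subprocess.check_output(
    ['lvs', '--noheadings', '--separator', '|', '-o', 'vg_name,lv_name,lv_tags'],
    text=True,
)
for line in out.splitlines():
    vg, lv, tags = [field.strip() for field in line.split('|', 2)]
    if 'remove_me' in tags:
        # Volumes that VDSM marked for removal/zeroing but never finished deleting.
        print('leftover volume: {}/{} tags={}'.format(vg, lv, tags))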


>>> VDSM LOGS
2020-05-12 12:48:04,905+0530 INFO  (jsonrpc/2) [api.virt] START hotunplugDisk(params={u'xml': u'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><hotunplug><devices><disk><alias name="ua-8d5779f9-a5a6-4d1d-a705-000cceef73d0"/></disk></devices></hotunplug>', u'vmId': u'e89c01f5-411a-4a73-b1d6-0392bb7ceec8'}) from=::ffff:192.168.0.55,39268, flow_id=049dfdfb-bf07-48d1-8610-286d4dc06253, vmId=e89c01f5-411a-4a73-b1d6-0392bb7ceec8 (api:48)
2020-05-12 12:48:04,907+0530 INFO  (jsonrpc/2) [virt.vm] (vmId='e89c01f5-411a-4a73-b1d6-0392bb7ceec8') Hotunplug disk xml: <?xml version='1.0' encoding='utf-8'?>
<disk device="disk" snapshot="no" type="block">
    <address bus="0" controller="0" target="0" type="drive" unit="2" />
    <source dev="/rhev/data-center/mnt/blockSD/81dbbab0-46a1-4999-acfb-0f593fa05b58/images/8d5779f9-a5a6-4d1d-a705-000cceef73d0/07df6929-7c6b-427d-90fb-2cb6315b0d62">
        <seclabel model="dac" relabel="no" type="none" />
    </source>
    <target bus="scsi" dev="sdc" />
    <serial>8d5779f9-a5a6-4d1d-a705-000cceef73d0</serial>
    <driver cache="none" error_policy="stop" io="native" name="qemu" type="qcow2" />
    <alias name="ua-8d5779f9-a5a6-4d1d-a705-000cceef73d0" />
</disk>
 (vm:3862)
2020-05-12 12:48:04,949+0530 INFO  (libvirt/events) [virt.vm] (vmId='e89c01f5-411a-4a73-b1d6-0392bb7ceec8') Device removal reported: ua-8d5779f9-a5a6-4d1d-a705-000cceef73d0 (vm:6046)
2020-05-12 12:48:04,969+0530 INFO  (jsonrpc/2) [vdsm.api] START teardownImage(sdUUID='81dbbab0-46a1-4999-acfb-0f593fa05b58', spUUID='352b829e-4d72-11ea-aa35-00163e452e33', imgUUID='8d5779f9-a5a6-4d1d-a705-000cceef73d0', volUUID=None) from=::ffff:192.168.0.55,39268, flow_id=049dfdfb-bf07-48d1-8610-286d4dc06253, task_id=bf659180-7bb7-4139-8032-ea2bc4eb7540 (api:48)
2020-05-12 12:48:04,970+0530 INFO  (jsonrpc/2) [storage.StorageDomain] Removing image run directory '/var/run/vdsm/storage/81dbbab0-46a1-4999-acfb-0f593fa05b58/8d5779f9-a5a6-4d1d-a705-000cceef73d0' (blockSD:1448)
2020-05-12 12:48:04,970+0530 INFO  (jsonrpc/2) [storage.fileUtils] Removing directory: /var/run/vdsm/storage/81dbbab0-46a1-4999-acfb-0f593fa05b58/8d5779f9-a5a6-4d1d-a705-000cceef73d0 (fileUtils:180)

2020-05-12 12:48:30,521+0530 INFO  (jsonrpc/5) [vdsm.api] START deleteImage(sdUUID=u'81dbbab0-46a1-4999-acfb-0f593fa05b58', spUUID=u'352b829e-4d72-11ea-aa35-00163e452e33', imgUUID=u'8d5779f9-a5a6-4d1d-a705-000cceef73d0', postZero=u'true', force=u'false', discard=False) from=::ffff:192.168.0.55,39286, flow_id=6b6cebe6-6368-4278-aa3f-40163da8c7e4, task_id=c2041ad5-b98e-490c-bf8f-52671536189c (api:48)

2020-05-12 12:48:30,766+0530 INFO  (jsonrpc/5) [vdsm.api] FINISH deleteImage return=None from=::ffff:192.168.0.55,39286, flow_id=6b6cebe6-6368-4278-aa3f-40163da8c7e4, task_id=c2041ad5-b98e-490c-bf8f-52671536189c (api:54)
2020-05-12 12:48:30,780+0530 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Image.delete succeeded in 0.26 seconds (__init__:312)

2020-05-12 12:48:30,780+0530 INFO  (tasks/6) [storage.ThreadPool.WorkerThread] START task c2041ad5-b98e-490c-bf8f-52671536189c (cmd=<bound method Task.commit of <vdsm.storage.task.Task instance at 0x7f301818fb48>>, args=None) (threadPool:208)
2020-05-12 12:48:31,316+0530 INFO  (jsonrpc/1) [vdsm.api] FINISH getAllTasksStatuses return={'allTasksStatus': {'c2041ad5-b98e-490c-bf8f-52671536189c': {'code': 0, 'message': 'running job 1 of 1', 'taskState': 'running', 'taskResult': '', 'taskID': 'c2041ad5-b98e-490c-bf8f-52671536189c'}}} from=::ffff:192.168.0.55,39268, task_id=62490b0a-fb24-4753-9623-2d6f850e2e40 (api:54)

2020-05-14 14:01:05,164+0530 INFO  (jsonrpc/6) [vdsm.api] FINISH getAllTasksStatuses return={'allTasksStatus': {'c2041ad5-b98e-490c-bf8f-52671536189c': {'code': 0, 'message': 'running job 1 of 1', 'taskState': 'running', 'taskResult': '', 'taskID': 'c2041ad5-b98e-490c-bf8f-52671536189c'}}} from=::ffff:192.168.0.55,39268, task_id=33cd3664-e9b8-4fca-8065-b6ad7f22b18d (api:54)

2020-05-14 14:48:37,075+0530 INFO  (jsonrpc/2) [vdsm.api] FINISH getAllTasksStatuses return={'allTasksStatus': {'c2041ad5-b98e-490c-bf8f-52671536189c': {'code': 0, 'message': 'running job 1 of 1', 'taskState': 'running', 'taskResult': '', 'taskID': 'c2041ad5-b98e-490c-bf8f-52671536189c'}}} from=::ffff:192.168.0.55,39268, task_id=bca1aed4-4cb5-4fdf-801d-33bb6042c637 (api:54)
2020-05-14 14:48:37,075+0530 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksStatuses succeeded in 0.00 seconds (__init__:312)
2020-05-14 14:48:37,093+0530 INFO  (jsonrpc/4) [vdsm.api] START stopTask(taskID=u'c2041ad5-b98e-490c-bf8f-52671536189c', spUUID=None, options=None) from=::ffff:192.168.0.55,39268, task_id=dae564d4-354c-4a22-bb05-bda97f870187 (api:48)
2020-05-14 14:48:37,093+0530 INFO  (jsonrpc/4) [storage.operation] Aborting <Command ['/sbin/blkdiscard', '--zeroout', '--step', '2097152', '--length', '6190121615360', u'/dev/81dbbab0-46a1-4999-acfb-0f593fa05b58/07df6929-7c6b-427d-90fb-2cb6315b0d62'] aborting, cwd=None at 0x7f2ff8081d90> (operation:127)

2020-05-14 14:48:37,093+0530 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call Task.stop succeeded in 0.00 seconds (__init__:312)
2020-05-14 14:48:37,190+0530 ERROR (tasks/6) [storage.TaskManager.Task] (Task='c2041ad5-b98e-490c-bf8f-52671536189c') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1375, in zeroImage
    zeroImgVolumes(sdUUID, imgUUID, toZero, discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 302, in zeroImgVolumes
    raise se.VolumesZeroingError(errors)
VolumesZeroingError: Cannot zero out volume: (['Action was stopped: ()'],)
2020-05-14 14:48:37,228+0530 INFO  (tasks/6) [storage.ThreadPool.WorkerThread] FINISH task c2041ad5-b98e-490c-bf8f-52671536189c (threadPool:210)

2020-05-14 14:48:47,105+0530 INFO  (jsonrpc/7) [vdsm.api] FINISH getAllTasksStatuses return={'allTasksStatus': {'c2041ad5-b98e-490c-bf8f-52671536189c': {'code': 411, 'message': u'c2041ad5-b98e-490c-bf8f-52671536189c', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'c2041ad5-b98e-490c-bf8f-52671536189c'}}} from=::ffff:192.168.0.55,39268, task_id=bd026500-baac-4df9-8956-66d544669a22 (api:54)
2020-05-14 14:48:47,105+0530 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksStatuses succeeded in 0.00 seconds (__init__:312)
2020-05-14 14:48:47,139+0530 INFO  (jsonrpc/6) [vdsm.api] START clearTask(taskID=u'c2041ad5-b98e-490c-bf8f-52671536189c', spUUID=None, options=None) from=::ffff:192.168.0.55,39268, task_id=36dc0bac-9f08-4a26-bc67-507bfd7d82de (api:48)
2020-05-14 14:48:47,141+0530 INFO  (jsonrpc/6) [vdsm.api] FINISH clearTask return=None from=::ffff:192.168.0.55,39268, task_id=36dc0bac-9f08-4a26-bc67-507bfd7d82de (api:54)
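
To illustrate what happens around the blkdiscard --zeroout command and the VolumesZeroingError traceback above: VDSM zeroes the volume with a long-running child process, and the engine's Task.stop aborts that process, which surfaces as the zeroing failure. A very rough sketch of that shape; this is not VDSM's actual code, and the volumes, sizes, and abort hook are placeholders:

import subprocess
import time

class VolumesZeroingError(Exception):
    # Raised when one or more volumes could not be zeroed.
    pass

def zero_volume(device, length, step=2 * 1024 * 1024):
    # Start zeroing `length` bytes of `device`, as seen in the log above.
    return subprocess.Popen([
        '/sbin/blkdiscard', '--zeroout',
        '--step', str(step),
        '--length', str(length),
        device,
    ])

def zero_image(volumes, abort_requested):
    # volumes: iterable of (device_path, length); abort_requested: callable -> bool.
    errors = []
    for device, length in volumes:
        proc = zero_volume(device, length)
        while proc.poll() is None:
            if abort_requested():          # the engine's Task.stop lands here
                proc.terminate()
                proc.wait()
                break
            time.sleep(1)
        if proc.returncode != 0:
            errors.append('Action was stopped: {}'.format(device))
    if errors:
        # This is the failure seen in the traceback above; before the fix the
        # engine nevertheless reported the disk removal as successful.
        raise VolumesZeroingError(errors)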

Comment 1 Frank DeLorey 2020-05-15 16:18:42 UTC
Created attachment 1689002 [details]
Engine log from timeframe

Comment 2 Frank DeLorey 2020-05-15 16:20:17 UTC
Created attachment 1689004 [details]
VDSM logs from start of remove disk

Comment 3 Frank DeLorey 2020-05-15 16:21:53 UTC
Created attachment 1689005 [details]
VDSM logs from timeout failure

Comment 4 Nir Soffer 2020-05-18 17:52:36 UTC
This is a known issue. The engine must track deleted disks until the disk is actually deleted
from storage. We have other bugs about the same issue.

Deferring to 4.5 since this requires a redesign of the way disks are tracked in the engine, and
the 4.4 development phase has ended.

Comment 5 Eyal Shenitzky 2021-08-29 08:44:58 UTC
This bug/RFE didn't get enough attention so far and is now flagged as pending close. 
Please review if it is still relevant and provide additional details/justification/patches
if you believe it should get more attention for the next oVirt release.

Comment 6 Frank DeLorey 2021-08-29 08:52:36 UTC
This should not be closed as it can still happen in RHV 4.4. Nir stated that this bug is a known issue and that there may be other bugs tracking it, so if this is to be closed, the other bugs should be noted so we can still track the fix for this problem.

Frank

Comment 7 Michal Skrivanek 2021-09-29 10:14:10 UTC
This didn't get into 4.5 either; it's time to close this unless there is new information or a reproduction scenario.

Comment 10 Michal Skrivanek 2022-04-11 08:30:21 UTC
no new information, didn't make it into 4.5, closing

Comment 11 Arik 2022-04-13 12:39:33 UTC
The implication of ignoring this one is that we are left with junk on the storage side.
That is maybe not that bad, but we are able to identify the failure on the VDSM side (according to comment 1), so let's try to monitor that property on the engine side instead of dropping the disk from the database.
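
As an illustration only of the direction described here and in the Doc Text (the real fix is in the ovirt-engine Java code, see the linked pull request "core: RemoveImage: handle failure in VDS command"), this Python sketch just models the decision; db and audit_log are hypothetical stand-ins for the engine's DAO and audit log layers:

from enum import Enum

class ImageStatus(Enum):
    OK = 1
    LOCKED = 2
    ILLEGAL = 4

def end_remove_image(task_succeeded, image, db, audit_log):
    if task_succeeded:
        # Success: the image row can be dropped as before.
        db.remove_image(image.id)
    else:
        # Failure (e.g. the wipe timed out): keep the row so the leftover
        # volume stays visible, mark it ILLEGAL, and raise an audit log event
        # instead of silently dropping the disk from the database.
        db.update_image_status(image.id, ImageStatus.ILLEGAL)
        audit_log.error('Removal of disk {} failed; disk marked ILLEGAL'.format(image.id))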

Comment 12 Arik 2022-05-24 08:46:53 UTC
We noticed in engine.log that RemoveDisk failed, but probably only after the relevant parts were already removed from the database.

Comment 17 Ilia Markelov 2022-10-14 08:59:14 UTC
Verified.

The disk was successfully removed and doesn't exist.

Versions:
engine-4.5.3.1-2.el8ev
vdsm-4.50.3.2-1.el8ev.x86_64

Comment 19 sshmulev 2022-10-23 13:05:41 UTC
RemoveDisk finishes before the disk is removed from the DB


Description:
The job for RemoveDisk finishes before the disk is actually removed from the DB.
This causes failures in our automation tests in all tiers during teardown, when removing a disk and afterward removing a VM: the VM removal fails because the disk is still being removed, and a VM can't be removed while its disks are locked.
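
As a hedged illustration of a client-side workaround for this ordering (not our actual test code; the helper name and timeouts are made up), the teardown can wait until the disk actually disappears, or is at least unlocked, before removing the VM:

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

def wait_for_disk_gone(disks_service, disk_id, timeout=600, interval=5):
    # Poll until the disk no longer exists in the engine, or is unlocked again.
    disk_service = disks_service.disk_service(disk_id)
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            disk = disk_service.get()
        except sdk.NotFoundError:
            return True        # disk really gone; safe to remove the VM now
        if disk.status != types.DiskStatus.LOCKED:
            return True        # removal failed and the disk was released
        time.sleep(interval)
    return False

Only after this returns should the VM removal be attempted.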

How reproducible:
100%

Versions:
ovirt-engine-4.5.3.1-2.el8ev
vdsm-4.50.3.4-1.el8ev

Comment 20 Arik 2022-10-23 19:14:04 UTC
The priority of *this bug* should not be urgent, and the issue described in comment 19 is unrelated to the reported scenario: it's a consequence of the fix that affects our automation tests and deserves its own bug (where we can discuss its severity).

Comment 24 errata-xmlrpc 2022-11-16 12:17:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) [ovirt-4.5.3] bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8502

