Bug 1650985 - After failed MIXED LSM - correct problem - second attempt fails with Cannot create Logical Volume as it already exists in volume group code = 550
Summary: After failed MIXED LSM - correct problem - second attempt fails with Cannot c...
Keywords:
Status: CLOSED DUPLICATE of bug 1611929
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
unspecified
high with 1 vote vote
Target Milestone: ovirt-4.3.5
: ---
Assignee: Benny Zlotnik
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks: 1565685
TreeView+ depends on / blocked
 
Reported: 2018-11-18 13:26 UTC by Avihai
Modified: 2019-05-28 12:09 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-28 12:09:59 UTC
oVirt Team: Storage
rule-engine: ovirt-4.3+


Attachments (Terms of Use)
engine and vdsm logs (1.91 MB, application/x-gzip)
2018-11-18 13:26 UTC, Avihai
no flags Details

Description Avihai 2018-11-18 13:26:48 UTC
Created attachment 1506955 [details]
engine and vdsm logs

Description of problem:
Fail 1st LSM(NFS->ISCSI) blocking its source SD(NFS) from SPM and wait for LSM to fail.
After it fails unblock connection and wait until all hosts/SD are up/available and try an the same LSM(NFS->ISCSI) this time unblocked.

LSM fails with:
Engine:
2nd LSM
2018-11-18 14:55:38,024+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-14) [7a76d927-bc21-4384-9520-4b2d38137b15] EVENT_ID: USER_MOVED_DISK(2,008), User admin@inter
nal-authz moving disk vm1_Disk2 to domain iscsi_0.

LSM failed :
2018-11-18 14:56:01,495+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2018-11-18 14:56:01,510+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM h
ost_mixed_3 command HSMGetAllTasksStatusesVDS failed: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\'  Logical Volume "f14f759c-42dd
-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']'
2018-11-18 14:56:01,511+02 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] SPMAsyncTask::PollTask: Polling task '1792442e-5dec-4fec-9bf4-ddad44957aa5' 
(Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2018-11-18 14:56:01,516+02 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] BaseAsyncTask::logEndTaskFailure: Task '1792442e-5dec-4fec-9bf4-ddad44957aa5
' (Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800c
c5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b0980
0cc5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550'
2018-11-18 14:56:01,517+02 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] CommandAsyncTask::endActionIfNecessary: All tasks of command 'c9048d09-d
79e-4652-8522-3456e2599c75' has ended -> executing 'endAction'
2018-11-18 14:56:01,517+02 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: 'c
9048d09-d79e-4652-8522-3456e2599c75'): calling endAction '.
2018-11-18 14:56:01,519+02 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3855) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAc
tion 'CreateImagePlaceholder',
2018-11-18 14:56:01,532+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand] (EE-ManagedThreadFactory-engine-Thread-3855) [7a76d927-bc21-4384-9520-4b2d38137b15] Ending command 'org.ovir
t.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand' with failure.



VDSM host_mixed_3:
2018-11-18 14:55:51,951+0200 DEBUG (tasks/5) [storage.Misc.excCmd] FAILED: <err> = '  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\n
'; <rc> = 5 (commands:89)
2018-11-18 14:55:51,952+0200 ERROR (tasks/5) [storage.Volume] Failed to create volume /rhev/data-center/mnt/blockSD/02b98d9e-82d7-468c-9456-d8fcb00159e6/images/0f98c097-f1fd-4dc9-bd99-c9417321eb20/f14f759c-42dd-4843-9b61-b09800cc5f58: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']' (volume:1208)
2018-11-18 14:55:51,952+0200 ERROR (tasks/5) [storage.Volume] Unexpected error (volume:1245)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1205, in create
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 507, in _create
    initialTags=(sc.TAG_VOL_UNINIT,))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 1161, in createLV
    raise se.CannotCreateLogicalVolume(vgName, lvName, err)
CannotCreateLogicalVolume: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']'
2018-11-18 14:55:51,953+0200 ERROR (tasks/5) [storage.Image] Unexpected error (image:422)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 405, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 816, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1205, in create
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 507, in _create
    initialTags=(sc.TAG_VOL_UNINIT,))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 1161, in createLV
    raise se.CannotCreateLogicalVolume(vgName, lvName, err)
CannotCreateLogicalVolume: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']'


Version-Release number of selected component (if applicable):
ovirt-engine- 4.3.0-0.0.master.20181008123014
VDSM - 4.30.1-25.gitce9e416.el7

How reproducible:
Happened manually twice so far (100%)

Steps to Reproduce:
1. Create VM with disks, OS and write data
2. LSM nfs disk to iscsi domain -> block from SPM host(iptables) of source(nfs) domain during host Non Operational - LSM - fails eventually .
3. Unblock(iptables) of source(nfs) domain - host operational again
4. Run same LSM again - fails CreateImagePlaceholder with :
cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b0980
0cc5f58 err=[\'  Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550'

Actual results:
2nd LSM failed 

Expected results:


Additional info:

Comment 1 Avihai 2018-11-18 13:36:07 UTC
Trying to verify Bug 1565685 I encountered this bug.

As the scenario of this bug and Bug 1565685 are the same I marked this bug as a blocker for Bug 1565685 verification.

Comment 2 Sandro Bonazzola 2019-01-28 09:34:41 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 3 Benny Zlotnik 2019-05-28 12:09:59 UTC
As seen in the logs, there was an attempt to cleanup, but it failed because vdsm failed to respond because it had no access to the storage:
2018-11-18 14:49:58,449+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-76) [f87e17d7-aee9-4ed6-a914-a1241b7917c5] Command 'VmReplicateDiskFinishVDSCommand(HostName = host_mixed_1, VmReplicateDiskParameters:{hostId='640c82d8-d796-4056-91df-71de40894752', vmId='543a221d-896f-4889-b8ad-c979025fbb00', storagePoolId='0bbd940f-5c1a-4a04-b936-e76fe5ee71d8', srcStorageDomainId='1d1674d8-2d91-4067-bd6a-2008979532ad', targetStorageDomainId='1d1674d8-2d91-4067-bd6a-2008979532ad', imageGroupId='0f98c097-f1fd-4dc9-bd99-c9417321eb20', imageId='27b76597-af4a-449e-8097-731595c10922'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues

There is not much we can do in these cases and manual cleanup has to be performed.
Setting as duplicate of bugzilla 1611929

*** This bug has been marked as a duplicate of bug 1611929 ***


Note You need to log in before you can comment on or make changes to this bug.