Bug 1507994
| Summary: | Failed to remove single disk from snapshot while vm is running | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Kevin Alon Goldblatt <kgoldbla> |
| Component: | BLL.Storage | Assignee: | Ala Hino <ahino> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | high | Priority: | unspecified |
| Version: | 4.2.0 | CC: | amureini, bugs, kgoldbla |
| Target Milestone: | ovirt-4.2.0 | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Whiteboard: | storage | Flags: | rule-engine: ovirt-4.2+ |
| oVirt Team: | Storage | Type: | Bug |
| Last Closed: | 2017-12-20 11:16:34 UTC | Doc Type: | If docs needed, set a value |
| Bug Blocks: | 1481688 | Attachments: | vdsm, server, engine logs (attachment 1346001) |
Created attachment 1346001 [details]
vdsm, server, engine logs
Added logs

The log is full of domain monitoring errors.

Does this happen on non-HCI environments? On other HCI environments?

(In reply to Allon Mureinik from comment #2)
> The log is full of domain monitoring errors.
>
> Does this happen on non HCI environments? On other HCI environments?

It works on a regular environment.

@Kevin,
Did you perform any disk move?

The following vdsm error shows that disk replication fails to start:
2017-10-31 17:20:33,898+0200 ERROR (jsonrpc/7) [virt.vm] (vmId='7330e919-62bc-4e3f-990b-111e2fcbf87d') Unable to start replication for sda to {u'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'volumeInfo': {'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b886bc8-cece-470b-950a-0c3b8da5315f', 'protocol': 'gluster', 'hosts': [{'port': '0', 'transport': 'tcp', 'name': '10.35.83.240'}, {'port': '0', 'transport': 'tcp', 'name': '10.35.83.241'}, {'port': '0', 'transport': 'tcp', 'name': '10.35.83.242'}]}, 'format': 'cow', u'poolID': u'96cc313a-67d4-49e8-8979-27d3fcfd171c', u'device': 'disk', 'protocol': 'gluster', 'propagateErrors': 'off', u'diskType': u'network', 'cache': 'none', u'volumeID': u'2b886bc8-cece-470b-950a-0c3b8da5315f', u'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f', 'hosts': [{'port': '0', 'transport': 'tcp', 'name': '10.35.83.240'}], 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b886bc8-cece-470b-950a-0c3b8da5315f', 'volumeChain': [{'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'leaseOffset': 0, 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/4e01405c-80fd-40d4-b5aa-57ec08f23dc1', 'volumeID': u'4e01405c-80fd-40d4-b5aa-57ec08f23dc1', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.35.83.240:_glust1__rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/4e01405c-80fd-40d4-b5aa-57ec08f23dc1.lease', 'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f'}, {'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'leaseOffset': 0, 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b886bc8-cece-470b-950a-0c3b8da5315f', 'volumeID': u'2b886bc8-cece-470b-950a-0c3b8da5315f', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.35.83.240:_glust1__rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b886bc8-cece-470b-950a-0c3b8da5315f.lease', 'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f'}, {'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'leaseOffset': 0, 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/533d83bc-1473-4bc3-bb3f-41254eb786c0', 'volumeID': u'533d83bc-1473-4bc3-bb3f-41254eb786c0', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.35.83.240:_glust1__rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/533d83bc-1473-4bc3-bb3f-41254eb786c0.lease', 'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f'}, {'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'leaseOffset': 0, 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b0b5b2f-a015-4593-9e59-e22667b9009c', 'volumeID': u'2b0b5b2f-a015-4593-9e59-e22667b9009c', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.35.83.240:_glust1__rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/2b0b5b2f-a015-4593-9e59-e22667b9009c.lease', 'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f'}, {'domainID': u'01bdd2ad-f506-4e4d-b1c5-e037a1630821', 'leaseOffset': 0, 'path': u'glust1_rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/d566b684-1499-4afc-b74d-405102591aee', 'volumeID': u'd566b684-1499-4afc-b74d-405102591aee', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.35.83.240:_glust1__rep3/01bdd2ad-f506-4e4d-b1c5-e037a1630821/images/db4120d3-da1c-4297-ab67-8c53c2bbe62f/d566b684-1499-4afc-b74d-405102591aee.lease', 'imageID': u'db4120d3-da1c-4297-ab67-8c53c2bbe62f'}]} (vm:4356)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4350, in diskReplicateStart
    self._startDriveReplication(drive)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4479, in _startDriveReplication
    self._dom.blockCopy(drive.name, destxml, flags=flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 125, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 586, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 688, in blockCopy
    if ret == -1: raise libvirtError ('virDomainBlockCopy() failed', dom=self)
libvirtError: argument unsupported: non-file destination not supported yet
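
For context, this is roughly the call that fails: vdsm builds a destination disk XML from the volumeInfo dict in the log above and passes it to libvirt's blockCopy. Because the destination is a gluster (network) disk rather than a file, the libvirt version on the host rejects the copy. The sketch below is illustrative only — the XML is loosely reconstructed from the log with abbreviated paths, and the connection URI, VM name, and flags are assumptions, not vdsm's actual code:

```python
import libvirt

# Hypothetical destination XML, loosely reconstructed from the volumeInfo
# in the log above (paths abbreviated). A type='network' destination is
# what triggers "non-file destination not supported yet" on older libvirt.
DEST_XML = """
<disk type='network' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source protocol='gluster' name='glust1_rep3/01bdd2ad-.../images/.../2b886bc8-...'>
    <host name='10.35.83.240' transport='tcp'/>
  </source>
</disk>
"""

conn = libvirt.open('qemu:///system')      # local hypervisor connection
dom = conn.lookupByName('clonefromtp1')    # the running VM from this report

try:
    # flags=0 for simplicity here; vdsm passes its own block-copy flags.
    dom.blockCopy('sda', DEST_XML, flags=0)
except libvirt.libvirtError as e:
    # On libvirt without network-destination support this prints:
    # "argument unsupported: non-file destination not supported yet"
    print('blockCopy failed: %s' % e)
```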
(In reply to Ala Hino from comment #4)
> @Kevin,
>
> Did you perform any disk move?
> Following vdsm error shows that disk replication fails to start:

This underlying issue was solved by Denis' 2c03d2307b8da5fb6fc6eac026ea554beefc2dc2, merged on October 30th (which was not available when this bug was reported). Please retest with a newer engine build that contains the fix.

Verified with the following code:
----------------------------------------------
ovirt-engine-4.2.0-0.5.master.el7.noarch
vdsm-4.20.7-55.git11440d6.el7.centos.x86_64

Verified with the following scenario:
----------------------------------------------
Steps to Reproduce:
1. Create a vm with 2 disks, create a snapshot
2. Start the vm, write data, create a 2nd snapshot
3. Write data, create a 3rd snapshot
4. Via Webadmin->Storage Domains->Hyperconverged Gluster domain->Disk Snapshots->select the snapshot disk from the 2nd snapshot and press Remove

>>>>> The disk snapshot is successfully removed

This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
Description of problem:
Deleting a single disk from a snapshot with the vm up fails

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.0-0.0.master.20171029154613.git19686f3.el7.centos.noarch
vdsm-4.20.5-5.git3d47978.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a vm with 2 disks, create a snapshot
2. Start the vm, write data, create a 2nd snapshot
3. Write data, create a 3rd snapshot
4. Via Webadmin->Storage Domains->Hyperconverged Gluster domain->Disk Snapshots->select the snapshot disk from the 2nd snapshot and press Remove

>>>>> Fails with "Failed to complete deletion of Disk 'clonefromtp1_Disk1' from snapshot(s) 'snaa3' of VM 'clonefromtp1' (User: admin@internal-authz)"

Actual results:
Fails to delete the disk snapshot from a live vm

Expected results:
Should work

Additional info:
From the engine logs:
-------------------------
2017-10-31 17:55:22,727+02 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-4) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Failed to live merge, still in volume chain: [4e01405c-80fd-40d4-b5aa-57ec08f23dc1, d566b684-1499-4afc-b74d-405102591aee]
2017-10-31 17:55:23,395+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler5) [5cf39523] START, GlusterServersListVDSCommand(HostName = vm-83-164.scl.lab.tlv.redhat.com, VdsIdVDSCommandParametersBase:{hostId='265056d7-a5dc-4e16-baae-2db73c51e714'}), log id: 651e0bc7
2017-10-31 17:55:23,581+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-56) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Command id: '74055a27-4935-446d-a7ea-75855cd8209d failed child command status for step 'MERGE_STATUS'
2017-10-31 17:55:23,581+02 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-56) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Command 'RemoveSnapshotSingleDiskLive' id: '74055a27-4935-446d-a7ea-75855cd8209d' child commands '[9d2a47b9-6d00-4a72-9178-890d415345fc, cb9a7cff-a704-4499-b344-052a54456ca6, fe05e4e0-bb07-476a-9bb7-dd58bc8f52de]' executions were completed, status 'FAILED'
2017-10-31 17:55:24,393+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler5) [5cf39523] FINISH, GlusterServersListVDSCommand, return: [10.35.83.164/23:CONNECTED, vm-83-162.scl.lab.tlv.redhat.com:CONNECTED, vm-83-163.scl.lab.tlv.redhat.com:CONNECTED], log id: 651e0bc7
2017-10-31 17:55:24,403+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler5) [5cf39523] START, GlusterVolumesListVDSCommand(HostName = vm-83-164.scl.lab.tlv.redhat.com, GlusterVolumesListVDSParameters:{hostId='265056d7-a5dc-4e16-baae-2db73c51e714'}), log id: 2f8d8eea
2017-10-31 17:55:24,679+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-97) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Merging of snapshot 'beaa2b2e-a8dc-4b58-ac0b-a29eaebafe35' images '4e01405c-80fd-40d4-b5aa-57ec08f23dc1'..'d566b684-1499-4afc-b74d-405102591aee' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
2017-10-31 17:55:24,719+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-97) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' with failure.
2017-10-31 17:55:24,806+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-97) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Command 'RemoveDiskSnapshots' id: '99cd1c4d-09cf-4279-a571-a431afd18126' child commands '[74055a27-4935-446d-a7ea-75855cd8209d]' executions were completed, status 'FAILED'
2017-10-31 17:55:24,953+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler5) [5cf39523] FINISH, GlusterVolumesListVDSCommand, return: {12548706-4031-485f-b19f-8c0d651add1f=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@6dbb9f9a}, log id: 2f8d8eea
2017-10-31 17:55:25,959+02 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveDiskSnapshotsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-89) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveDiskSnapshotsCommand' with failure.
2017-10-31 17:55:26,270+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-89) [a527abf0-ae22-4083-ac96-ee3d8cbcce3a] EVENT_ID: USER_REMOVE_DISK_SNAPSHOT_FINISHED_FAILURE(376), Failed to complete deletion of Disk 'clonefromtp1_Disk1' from snapshot(s) 'snaa3' of VM 'clonefromtp1' (User: admin@internal-authz).
2017-10-31 17:55:29,989+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler4) [340c9e07] START, GlusterServersListVDSCommand(HostName = vm-83-164.scl.lab.tlv.redhat.com, VdsIdVDSCommandParametersBase:{hostId='265056d7-a5dc-4e16-baae-2db73c51e714'}), log id: 4fa79dac
2017-10-31 17:55:31,030+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler4) [340c9e07] FINISH, GlusterServersListVDSCommand, return: [10.35.83.164/23:CONNECTED, vm-83-162.scl.lab.tlv.redhat.com:CONNECTED, vm-83-163.scl.lab.tlv.redhat.com:CONNECTED], log id: 4fa79dac
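
As a side note on the MERGE_STATUS failure above: after a live merge, the engine compares the volume chain the host reports for the disk against the images the merge was supposed to remove, and fails the step if any are still present — which is exactly what the "still in volume chain" line records. A minimal Python sketch of that check (illustrative only; the real MergeStatusCommand is engine-side Java, and the function name here is hypothetical):

```python
def find_unmerged(reported_chain, images_to_remove):
    """Return the images the merge should have removed but that the
    host still reports in the disk's volume chain."""
    return sorted(set(reported_chain) & set(images_to_remove))

# Values taken from the engine log above: the merge covered images
# 4e01405c-... through d566b684-..., yet both were still reported in
# the chain, so the MERGE_STATUS step was marked FAILED.
chain = ['4e01405c-80fd-40d4-b5aa-57ec08f23dc1',
         'd566b684-1499-4afc-b74d-405102591aee']
to_remove = ['4e01405c-80fd-40d4-b5aa-57ec08f23dc1',
             'd566b684-1499-4afc-b74d-405102591aee']

leftovers = find_unmerged(chain, to_remove)
if leftovers:
    print('Failed to live merge, still in volume chain: %s' % leftovers)
```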