Bug 1489361
| Summary: | [gfapi] Storage migration fails in VmReplicateDiskFinishVDSCommand stage w/ gfapi access | | |
| --- | --- | --- | --- |
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Sahina Bose <sabose> |
| Component: | rhhi | Assignee: | Sahina Bose <sabose> |
| Status: | CLOSED DEFERRED | QA Contact: | SATHEESARAN <sasundar> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | rhhi-1.1 | CC: | bugs, derez, fromani, godas, knarra, nsoffer, rcyriac, rhs-bugs, sabose, sasundar, tnisan |
| Target Milestone: | --- | Keywords: | Automation, Tracking |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1481688 | Environment: | |
| Last Closed: | 2020-08-01 02:53:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Gluster | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1481688 | | |
| Bug Blocks: | | | |
Description
Sahina Bose
2017-09-07 09:26:03 UTC
Hi, I have tested LSM with both the source and target domains on glusterfs with libgfapi enabled, and I see that it fails with the following errors in the engine and vdsm logs.

engine.log:
==================================
2017-08-31 01:50:00,390-04 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-28) [b9830c2e-c16a-485f-b91b-45058698f0aa] CommandAsyncTask::HandleEndActionResult [within thread]: Removing CommandMultiAsyncTasks object for entity '3c1e7646-5a69-49cb-862f-2ee431fd2961'
2017-08-31 01:50:01,879-04 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler8) [7097b76a] FINISH, GlusterServersListVDSCommand, return: [10.70.36.78/23:CONNECTED, 10.70.36.76:CONNECTED, 10.70.36.77:CONNECTED], log id: 62ba0d5f
2017-08-31 01:50:01,891-04 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler8) [7097b76a] START, GlusterVolumesListVDSCommand(HostName = yarrow.lab.eng.blr.redhat.com, GlusterVolumesListVDSParameters:{runAsync='true', hostId='11972b68-a720-4236-bde0-3c066ceff3b6'}), log id: 2ddaf087
2017-08-31 01:50:02,095-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [b9830c2e-c16a-485f-b91b-45058698f0aa] Command 'LiveMigrateVmDisks' (id: '7a857239-50a8-412a-922d-09228065bcda') waiting on child command id: '1b57116f-ba8f-40a9-b0d9-df4f4bfec93f' type:'LiveMigrateDisk' to complete
2017-08-31 01:50:03,116-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] START, VmReplicateDiskStartVDSCommand(HostName = zod.lab.eng.blr.redhat.com, VmReplicateDiskParameters:{runAsync='true', hostId='6a90a343-10e4-4c5c-afc7-e5477594d2b6', vmId='b6da275e-649b-470d-ac32-89738faa3be5', storagePoolId='59a5429a-0236-034c-00d4-000000000250', srcStorageDomainId='d58a1d51-da25-454e-9edd-9754e419d668', targetStorageDomainId='1e67ccf5-8331-4e69-a0f6-c5ed58381217', imageGroupId='908c0355-ceb8-41c8-a644-5a551040d914', imageId='209455eb-4b82-417e-97ea-d0bab2fce721'}), log id: 59c08196
2017-08-31 01:50:03,147-04 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler8) [7097b76a] FINISH, GlusterVolumesListVDSCommand, return: {9eae9600-7185-4fbb-a42c-bbe9c6eb86ea=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@9669622b, 0763bd54-ea1c-4707-8abb-3d80d62df20f=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7aaadd95, 0dccff81-2ffa-4876-b4c3-15930fab2c65=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@844a93e1}, log id: 2ddaf087
2017-08-31 01:50:04,367-04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Failed in 'VmReplicateDiskStartVDS' method
2017-08-31 01:50:04,377-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM zod.lab.eng.blr.redhat.com command VmReplicateDiskStartVDS failed: Drive replication error
2017-08-31 01:50:04,377-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand' return value 'StatusOnlyReturn [status=Status [code=55, message=Drive replication error]]'
2017-08-31 01:50:04,377-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] HostName = zod.lab.eng.blr.redhat.com
2017-08-31 01:50:04,377-04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Command 'VmReplicateDiskStartVDSCommand(HostName = zod.lab.eng.blr.redhat.com, VmReplicateDiskParameters:{runAsync='true', hostId='6a90a343-10e4-4c5c-afc7-e5477594d2b6', vmId='b6da275e-649b-470d-ac32-89738faa3be5', storagePoolId='59a5429a-0236-034c-00d4-000000000250', srcStorageDomainId='d58a1d51-da25-454e-9edd-9754e419d668', targetStorageDomainId='1e67ccf5-8331-4e69-a0f6-c5ed58381217', imageGroupId='908c0355-ceb8-41c8-a644-5a551040d914', imageId='209455eb-4b82-417e-97ea-d0bab2fce721'})' execution failed: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive replication error, code = 55
2017-08-31 01:50:04,377-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] FINISH, VmReplicateDiskStartVDSCommand, log id: 59c08196
2017-08-31 01:50:04,377-04 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Failed VmReplicateDiskStart (Disk '908c0355-ceb8-41c8-a644-5a551040d914', VM 'b6da275e-649b-470d-ac32-89738faa3be5')
2017-08-31 01:50:04,378-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Command 'LiveMigrateDisk' id: '1b57116f-ba8f-40a9-b0d9-df4f4bfec93f' failed when attempting to perform the next operation, marking as FAILED '[3c1e7646-5a69-49cb-862f-2ee431fd2961]'
2017-08-31 01:50:04,378-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler10) [b9830c2e-c16a-485f-b91b-45058698f0aa] Command 'LiveMigrateDisk' id: '1b57116f-ba8f-40a9-b0d9-df4f4bfec93f' child commands '[3c1e7646-5a69-49cb-862f-2ee431fd2961]' executions were completed, status 'FAILED'
2017-08-31 01:50:05,400-04 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (DefaultQuartzScheduler7) [b9830c2e-c16a-485f-b91b-45058698f0aa] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.
2017-08-31 01:50:05,400-04 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (DefaultQuartzScheduler7) [b9830c2e-c16a-485f-b91b-45058698f0aa] Failed durin
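The LSM-related entries above all share the correlation id b9830c2e-c16a-485f-b91b-45058698f0aa (the Gluster server/volume list entries belong to a separate periodic flow), so the full LiveMigrateVmDisks -> LiveMigrateDisk -> VmReplicateDiskStartVDS sequence can be pulled out of a complete log in a few lines. A minimal sketch, assuming the usual engine log location /var/log/ovirt-engine/engine.log on the engine host:

```python
# Minimal sketch: print every engine.log line that belongs to one flow,
# keyed by the correlation id used throughout the excerpt above.
CORRELATION_ID = "b9830c2e-c16a-485f-b91b-45058698f0aa"
LOG_PATH = "/var/log/ovirt-engine/engine.log"  # assumed default location

with open(LOG_PATH) as log:
    for line in log:
        if CORRELATION_ID in line:
            print(line.rstrip())
```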
vdsm.log file:
==================================================
fce721', 'volumeChain': [{'domainID': u'1e67ccf5-8331-4e69-a0f6-c5ed58381217', 'leaseOffset': 0, 'path': u'data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/908c0355-ceb8-41c8-a644-5a551040d914/209455eb-4b82-417e-97ea-d0bab2fce721', 'volumeID': u'209455eb-4b82-417e-97ea-d0bab2fce721', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/908c0355-ceb8-41c8-a644-5a551040d914/209455eb-4b82-417e-97ea-d0bab2fce721.lease', 'imageID': u'908c0355-ceb8-41c8-a644-5a551040d914'}, {'domainID': u'1e67ccf5-8331-4e69-a0f6-c5ed58381217', 'leaseOffset': 0, 'path': u'data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/908c0355-ceb8-41c8-a644-5a551040d914/f2380cbb-1e16-452e-b597-2066c979cbc2', 'volumeID': u'f2380cbb-1e16-452e-b597-2066c979cbc2', 'leasePath': u'/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/908c0355-ceb8-41c8-a644-5a551040d914/f2380cbb-1e16-452e-b597-2066c979cbc2.lease', 'imageID': u'908c0355-ceb8-41c8-a644-5a551040d914'}]} (vm: 3739)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3727, in diskReplicateStart
    if utils.isBlockDevice(replica['path']):
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 96, in isBlockDevice
    return stat.S_ISBLK(os.stat(path).st_mode)
OSError: [Errno 2] No such file or directory: '/data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/908c0355-ceb8-41c8-a644-5a551040d914/209455eb-4b82-417e-97ea-d0bab2fce721'
2017-08-31 11:20:03,486+0530 INFO (jsonrpc/2) [vdsm.api] FINISH diskReplicateStart return={'status': {'message': 'Drive replication error', 'code': 55}} from=::ffff:10.70.34.124,42020, flow_id=b9830c2e-c16a-485f-b91b-45058698f0aa (api:52)
2017-08-31 11:20:03,487+0530 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call VM.diskReplicateStart failed (error 55) in 0.37 seconds (__init__:539)
2017-08-31 11:20:03,710+0530 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:53150 (protocoldetector:72)
2017-08-31 11:20:03,719+0530 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:53150 (protocoldetector:127)

Tested migrating a disk from one storage domain to another using glusterfs with libgfapi enabled while the VM is offline, and I see that it works fine.

Engine and vdsm logs can be found in the link below:
============================================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1484070/
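The traceback is the relevant part of the vdsm log: diskReplicateStart checks whether the destination replica is a block device by calling stat() on replica['path'], and with libgfapi enabled that path appears to be the gluster volume-relative form ('data/<domain>/images/...') rather than a path under the FUSE mount, so the check fails with ENOENT before replication even starts. A minimal sketch of the failing check, using the paths from the logs above (the helper below mirrors the two frames in the traceback; it is illustrative, not the actual vdsm source):

```python
import os
import stat


def is_block_device(path):
    # Same check as vdsm's utils.isBlockDevice in the traceback above:
    # os.stat() raises OSError (errno 2, ENOENT) when the path does not exist.
    return stat.S_ISBLK(os.stat(path).st_mode)


# Path taken from the OSError above: with gfapi the gluster volume name
# ('data') ends up being treated as a top-level directory on the host,
# which does not exist, so the stat fails on every host.
gfapi_path = ("/data/1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/"
              "908c0355-ceb8-41c8-a644-5a551040d914/"
              "209455eb-4b82-417e-97ea-d0bab2fce721")

# The FUSE-mounted form of the same volume, derived from the 'leasePath'
# values above; this only resolves on a host that has the domain mounted.
fuse_path = ("/rhev/data-center/mnt/glusterSD/10.70.36.78:_data/"
             "1e67ccf5-8331-4e69-a0f6-c5ed58381217/images/"
             "908c0355-ceb8-41c8-a644-5a551040d914/"
             "209455eb-4b82-417e-97ea-d0bab2fce721")

for path in (gfapi_path, fuse_path):
    try:
        print("%s -> block device: %s" % (path, is_block_device(path)))
    except OSError as err:
        # For gfapi_path this reproduces the "[Errno 2] No such file or
        # directory" error that makes diskReplicateStart return code 55.
        print("%s -> %s" % (path, err))
```

This is consistent with the observation later in this bug that live storage migration works when gfapi access is not enabled and the disks are addressed through the FUSE mount.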
Sas, can you confirm whether live storage migration fails if gfapi access is not turned on? If so, we will need to target this for RHHI 1.1. If not, we can move it to the next release.

Hi Sahina, when gfapi access is not turned on, i.e. with FUSE access, live storage migration works as expected. Clearing the needinfo on sas.

Thanks, kasturi.

Sahina, the dependent bug is CLOSED. We should also close this bug if the issue is resolved.

Moving to ON_QA if the issue has been resolved.

Sahina,

RHHI 2.0 doesn't support the gfapi access mechanism. Availability of this bug in the errata will make the user believe that a gfapi access mechanism related fix has been made.

Should we defer this bug from RHHI 2.0 and remove this bug from the errata?

(In reply to SATHEESARAN from comment #9)
> Sahina,
>
> RHHI 2.0 doesn't support gfapi access mechanism.
> Availability of this bug in errata will make the user believe that gfapi
> access mechanism related fix has been made.
>
> Should we defer this bug from RHHI 2.0 and remove this bug from errata ?

Yes, we can, though the dependent bug in oVirt is closed. I'll move to 2.0.z when we qualify with gfapi access.

(In reply to Sahina Bose from comment #10)
> (In reply to SATHEESARAN from comment #9)
> > Sahina,
> >
> > RHHI 2.0 doesn't support gfapi access mechanism.
> > Availability of this bug in errata will make the user believe that gfapi
> > access mechanism related fix has been made.
> >
> > Should we defer this bug from RHHI 2.0 and remove this bug from errata ?
>
> Yes, we can, though the dependent bug in oVirt is closed.
> I'll move to 2.0.z when we qualify with gfapi access.

There is no near term goal for RHHI to make use of gfapi. I'm closing this bug.