Bug 1497355

Summary: Live Storage Migration continued on after snapshot creation hung and timed out
Product: Red Hat Enterprise Virtualization Manager Reporter: Gordon Watson <gwatson>
Component: ovirt-engine Assignee: Benny Zlotnik <bzlotnik>
Status: CLOSED ERRATA QA Contact: Yosi Ben Shimon <ybenshim>
Severity: high Docs Contact:
Priority: medium    
Version: 4.1.6 CC: bzlotnik, ebenahar, jcoscia, kshukla, lsurette, lveyde, mjankula, Rhev-m-bugs, sborella, srevivo, stefano.stagnaro, tnisan
Target Milestone: ovirt-4.3.0 Keywords: ZStream
Target Release: 4.3.0 Flags: lsvaty: testing_plan_complete-
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: ovirt-engine-4.3.0_alpha Doc Type: If docs needed, set a value
Doc Text:
This release ensures the live storage migration process completes properly after creating a snapshot.
Story Points: ---
Clone Of:
: 1585039 Environment:
Last Closed: 2019-05-08 12:36:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1585039    

Description Gordon Watson 2017-09-29 21:53:56 UTC
Description of problem:

The snapshot creation step of a Live Storage Migration (LSM) hung and timed out on the engine side. However, the engine then continued with CloneImageGroupStructureVDSCommand, VmReplicateDiskStartVDSCommand, etc.

The result was that the LSM effectively failed, with the disk still residing in the source storage domain. However, volumes were created in the target storage domain, which caused a subsequent LSM to fail.
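
For illustration only, the expected behavior can be sketched as a step sequence that stops at the first failed step and cleans up what earlier steps created. This is not the ovirt-engine implementation (the engine is written in Java and its actual command flow is not shown in this report). The `Step` class and `run_flow` function are hypothetical, and only the step names are borrowed from the commands mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """Hypothetical migration step; not an ovirt-engine class."""
    name: str
    run: Callable[[], bool]    # returns False on failure or timeout
    undo: Callable[[], None]   # removes whatever run() created


def run_flow(steps: List[Step]) -> List[str]:
    """Run steps in order; on the first failure, undo completed steps
    in reverse order and stop. Returns a list of the actions taken."""
    actions: List[str] = []
    done: List[Step] = []
    for step in steps:
        if step.run():
            actions.append("ran " + step.name)
            done.append(step)
        else:
            actions.append("failed " + step.name)
            for prev in reversed(done):
                prev.undo()
                actions.append("undid " + prev.name)
            break
    return actions


# Snapshot creation fails (simulated), so the later steps must never run.
steps = [
    Step("CreateSnapshot", lambda: False, lambda: None),
    Step("CloneImageGroupStructure", lambda: True, lambda: None),
    Step("VmReplicateDiskStart", lambda: True, lambda: None),
]
print(run_flow(steps))
```

In this sketch a failed snapshot step means no volumes are ever created in the target storage domain, so a later LSM attempt would not find leftover volumes there.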


Version-Release number of selected component (if applicable):

RHV 4.1.6
RHVH 4.1.6
  vdsm-4.19.31-1.el7ev
  

How reproducible:

Not.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 4 Allon Mureinik 2017-10-01 09:28:45 UTC
Benny, can you take a look please?

Tentatively targeting for 4.2.
If there's something safe enough to backport to 4.1.z, we should do that, but I'm not committing to such a fix until we see what the upstream fix contains.

Comment 12 Elad 2018-08-21 12:34:48 UTC
Verify according to https://bugzilla.redhat.com/show_bug.cgi?id=1585039#c15

Comment 13 Yosi Ben Shimon 2018-10-16 14:57:08 UTC
Verified using:
ovirt-engine-4.3.0-0.0.master.20181012165724.gitd25f971.el7.noarch
vdsm-4.30.0-640.git6fd8327.el7.x86_64

I blocked the connection between the host (host_mixed_3) and the destination storage domain (iscsi_2).

*** Engine log:


2018-10-16 15:07:46,561+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-104) [14760f4a-6ff7-4c64-b713-0655f017afd0] EVENT_ID: USER_CREATE_SNAPSHOT(45), Snapshot 'yosi_test_Disk1 Auto-generated for Live Storage Migration' creation for VM 'yosi_test' was initiated by admin@internal-authz.

*** vdsm log:


2018-10-16 15:08:00,341+0300 INFO  (jsonrpc/3) [api.virt] START snapshot(snapDrives=[{u'baseVolumeID': u'6b5e067d-2520-4fcf-8e48-b42bec82c8ed', u'domainID': u'ca51be15-f214-4955-b9db-c7772c900104', u'volumeID': u'15149008-1995-42fc-b809-c0044f6f43aa', u'imageID': u'cd9af767-fe3e-4e91-b23d-b049ade7df23'}], snapMemory=None, frozen=True) from=::ffff:10.35.162.7,49886, flow_id=14760f4a-6ff7-4c64-b713-0655f017afd0, vmId=be44bd67-a80b-4f4c-8c58-4bc7554b2005 (api:48)

2018-10-16 15:08:35,470+0300 INFO  (jsonrpc/3) [api.virt] FINISH snapshot return={'status': {'message': 'Snapshot failed', 'code': 48}} from=::ffff:10.35.162.7,49886, flow_id=14760f4a-6ff7-4c64-b713-0655f017afd0, vmId=be44bd67-a80b-4f4c-8c58-4bc7554b2005 (api:54)
2018-10-16 15:08:35,471+0300 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.snapshot failed (error 48) in 35.13 seconds (__init__:312)


*** Engine log (end command):


2018-10-16 15:08:36,625+03 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [14760f4a-6ff7-4c64-b713-0655f017afd0] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand' with failure.


Verified upstream.
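
The engine and vdsm entries quoted above share the ID 14760f4a-6ff7-4c64-b713-0655f017afd0, which appears in square brackets in the engine log and as flow_id= in the vdsm log. A minimal sketch of matching entries across the two logs by that ID follows. The helper names `flow_id` and `lines_for_flow` are hypothetical, the sample lines are shortened copies of the entries above, and the patterns only assume the two layouts seen in this comment.

```python
import re
from typing import Iterable, List, Optional

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
VDSM_ID = re.compile(r"flow_id=(" + UUID + r")")   # vdsm: flow_id=<uuid>
ENGINE_ID = re.compile(r"\[(" + UUID + r")\]")     # engine: [<uuid>]


def flow_id(line: str) -> Optional[str]:
    """Return the flow/correlation ID found in a log line, if any."""
    match = VDSM_ID.search(line) or ENGINE_ID.search(line)
    return match.group(1) if match else None


def lines_for_flow(lines: Iterable[str], wanted: str) -> List[str]:
    """Keep only the lines that carry the given flow ID."""
    return [line for line in lines if flow_id(line) == wanted]


# Shortened sample lines based on the log excerpts in this comment.
sample = [
    "2018-10-16 15:07:46,561+03 INFO  [org.ovirt.engine.core.dal.dbbroker."
    "auditloghandling.AuditLogDirector] (default task-104) "
    "[14760f4a-6ff7-4c64-b713-0655f017afd0] EVENT_ID: USER_CREATE_SNAPSHOT(45)",
    "2018-10-16 15:08:35,470+0300 INFO  (jsonrpc/3) [api.virt] FINISH snapshot "
    "flow_id=14760f4a-6ff7-4c64-b713-0655f017afd0, "
    "vmId=be44bd67-a80b-4f4c-8c58-4bc7554b2005 (api:54)",
    "2018-10-16 15:08:35,471+0300 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] "
    "RPC call VM.snapshot failed (error 48) in 35.13 seconds (__init__:312)",
]

for line in lines_for_flow(sample, "14760f4a-6ff7-4c64-b713-0655f017afd0"):
    print(line)
```

Lines without an ID, such as the JsonRpcServer entry, are dropped by this filter, so they still need to be located by timestamp and thread name.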

Comment 15 errata-xmlrpc 2019-05-08 12:36:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085

Comment 16 Daniel Gur 2019-08-28 13:13:06 UTC
sync2jira

Comment 17 Daniel Gur 2019-08-28 13:17:18 UTC
sync2jira