Bug 1434105

Summary: Live storage migration failed
Product: [oVirt] ovirt-engine
Component: BLL.Storage
Version: 4.1.1.4
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: unspecified
Reporter: SATHEESARAN <sasundar>
Assignee: Benny Zlotnik <bzlotnik>
QA Contact: SATHEESARAN <sasundar>
Docs Contact:
CC: alitke, bill.james, bugs, nsoffer, sabose, sasundar, tnisan
Keywords: Reopened
Flags: sbonazzo: ovirt-4.3-
Target Milestone: ---
Target Release: ---
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-07-02 08:51:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1312909
Bug Blocks: 1277939, 1433896
Attachments:
- compressed archived vdsm.log
- compressed archived engine.log
- vdsm.log from host6 which ran the VM

Description SATHEESARAN 2017-03-20 17:49:48 UTC
Description of problem:
------------------------
I have two glusterfs data domains backed by sharded replica 3 gluster volumes (Domain_1 and Domain_3). I had a VM with its disk on Domain_1. When I tried to move the disk (with the VM active) from one domain (Domain_1) to the other (Domain_3), live storage migration failed with an error.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHV-4.1.1-5

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create two glusterfs data domains backed by sharded replica 3 gluster volumes
2. Create a VM with its root disk on domain_1
3. Install an OS on the VM and perform some I/O inside the VM
4. Move the disk from one domain to the other while the VM is running

Actual results:
---------------
Live storage migration failed.
The engine event log shows: "User admin@internal-authz have failed to move disk set2_appvm51_Disk1 to domain Domain_3."

Expected results:
------------------
The disk should be migrated to the other domain successfully.

Comment 1 SATHEESARAN 2017-03-20 18:10:44 UTC
Created attachment 1264818 [details]
compressed archived vdsm.log

Comment 2 SATHEESARAN 2017-03-20 18:11:37 UTC
Created attachment 1264819 [details]
compressed archived engine.log

Comment 3 SATHEESARAN 2017-03-21 10:31:45 UTC
The VM was running on host6, and vdsm.log on host6 shows the following errors:

# xzgrep ERROR /var/log/vdsm/vdsm.log.1.xz 
2017-03-20 08:54:51,213+0530 ERROR (monitor/1025388) [storage.Monitor] Error checking domain 10253882-040a-4f28-a6b6-3c2015fb7594 (monitor:426)
2017-03-20 08:54:51,250+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-3c2015fb7594/dom_md/metadata (monitor:485)
2017-03-20 08:54:55,631+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-3c2015fb7594/dom_md/metadata (monitor:485)
2017-03-20 08:55:01,224+0530 ERROR (monitor/1025388) [storage.Monitor] Error checking domain 10253882-040a-4f28-a6b6-3c2015fb7594 (monitor:426)
2017-03-20 18:58:52,706+0530 ERROR (jsonrpc/3) [virt.vm] (vmId='226433a3-4a34-46d6-91b6-2a679c394ec9') Replication job unfinished (drive: 'sda', srcDisk: {u'device': u'disk', u'poolID': u'00000001-0001-0001-0001-000000000311', u'volumeID': u'80c9f20c-3dd7-466b-808d-62337f6cc153', u'domainID': u'4828a460-1825-4ba9-b456-64afad078198', u'imageID': u'da3c126e-d848-4b2e-b1fe-eb9c4cf6b04d'}, job: {'end': 5059117056L, 'bandwidth': 0L, 'type': 2, 'cur': 5040308224L}) (vm:3753)
2017-03-20 18:58:53,990+0530 ERROR (jsonrpc/1) [virt.vm] (vmId='226433a3-4a34-46d6-91b6-2a679c394ec9') Replication job unfinished (drive: 'sda', srcDisk: {u'device': u'disk', u'poolID': u'00000001-0001-0001-0001-000000000311', u'volumeID': u'80c9f20c-3dd7-466b-808d-62337f6cc153', u'domainID': u'4828a460-1825-4ba9-b456-64afad078198', u'imageID': u'da3c126e-d848-4b2e-b1fe-eb9c4cf6b04d'}, job: {'end': 5062262784L, 'bandwidth': 0L, 'type': 2, 'cur': 5061869568L}) (vm:3753)
2017-03-20 18:59:00,186+0530 ERROR (jsonrpc/0) [virt.vm] (vmId='226433a3-4a34-46d6-91b6-2a679c394ec9') Live merge failed (job: 140f8b05-293f-4347-a0d1-fcde17df1e38) (vm:4798)

Comment 4 SATHEESARAN 2017-03-21 10:32:28 UTC
Created attachment 1265006 [details]
vdsm.log from host6 which ran the VM

Comment 5 Adam Litke 2017-03-21 14:23:07 UTC
I see the following exception in the log. It seems to indicate a problem on the gluster side. Can you check whether your gluster storage was reporting errors? I am not sure where to look for that, but something caused the gluster mount to become disconnected.

2017-03-20 08:54:55,631+0530 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-3c2015fb7594/dom_md/metadata (monitor:485)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 368, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-3c2015fb7594/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to open \'/rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-3c2015fb7594/dom_md/metadata\': Transport endpoint is not connected\n"))

Comment 6 Sahina Bose 2017-03-22 05:15:13 UTC
(In reply to Adam Litke from comment #5)
> Seeing the following exception in the log.  This seems to indicate a problem
> on the gluster side.  Can you check to make sure your gluster storage wasn't
> reporting errors?  I am not really sure where to look for that but something
> has caused the gluster mount to become disconnected.
> 
> 2017-03-20 08:54:55,631+0530 ERROR (check/loop) [storage.Monitor] Error
> checking path
> /rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-a6b6-
> 3c2015fb7594/dom_md/metadata (monitor:485)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/monitor.py", line 483, in _pathChecked
>     delay = result.delay()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 368,
> in delay
>     raise exception.MiscFileReadException(self.path, self.rc, self.err)
> MiscFileReadException: Internal file read failure:
> (u'/rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-
> a6b6-3c2015fb7594/dom_md/metadata', 1, bytearray(b"/usr/bin/dd: failed to
> open
> \'/rhev/data-center/mnt/glusterSD/10.70.36.73:_engine/10253882-040a-4f28-
> a6b6-3c2015fb7594/dom_md/metadata\': Transport endpoint is not connected\n"))

This error is from the HE gluster volume (HE storage domain). Does that affect the live migration on the other storage domains?

Comment 7 Adam Litke 2017-03-28 18:47:53 UTC
Sahina, you are correct. Those monitoring errors are unrelated.

Comment 8 Adam Litke 2017-03-28 19:12:02 UTC
After further analyzing the engine logs, I am concerned that there could be a problem with the engine flow for LiveMigrateDiskCommand. This command performs the following actions in sequence (a sketch of the flow follows the list):

1. CreateImagePlaceholder - Clone the disk structure from the source domain to the destination domain.

2. VmReplicateDiskStart - Instruct qemu to replicate the active volume of the source disk to the active layer of the destination disk and mirror all writes to both the source and destination disks. This operation is a two-phase process. First, data is copied from the source to the destination. Once this converges, subsequent writes are sent to both the source and destination simultaneously.

3. SyncImageGroupData - Sync the contents of source disk's internal volumes to the corresponding destination disk volumes.

--- wait for SyncImageGroupData to complete ---

4. VmReplicateDiskFinish - Instruct qemu to pivot to the destination disk and suspend write mirroring on the source disk.  This only works if step 2 has converged to the mirroring phase and step 3 is also complete.
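
To make the ordering concrete, here is a minimal sketch of that sequence. The function names are hypothetical stand-ins for the engine/VDSM verbs listed above, not the actual ovirt-engine implementation:

# Illustrative only: hypothetical helpers standing in for the engine/VDSM
# verbs named in the list above; not the actual ovirt-engine code.
def live_migrate_disk(vm, disk, src_domain, dst_domain):
    # 1. Clone the disk structure on the destination domain.
    create_image_placeholder(disk, src_domain, dst_domain)

    # 2. Start qemu replication: copy the active volume, then mirror new writes.
    vm_replicate_disk_start(vm, disk, src_domain, dst_domain)

    # 3. Copy the internal (non-active) volumes of the image chain.
    task = sync_image_group_data(disk, src_domain, dst_domain)
    task.wait()  # the engine waits for this step to complete

    # 4. Pivot to the destination and stop mirroring to the source.
    #    This succeeds only if step 2 has already converged to the
    #    mirroring phase and step 3 is complete.
    vm_replicate_disk_finish(vm, disk, src_domain, dst_domain)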


What appears to be happening in the attached logs is that, due to high I/O, step 2 is unable to converge before step 4 is executed. This is usually not a problem, since step 3 can take a relatively long time and, with low I/O, step 2 converges quickly.

The following log entry confirms that step 2 did not converge (cur != end):

2017-03-20 18:58:52,706+0530 ERROR (jsonrpc/3) [virt.vm] (vmId='226433a3-4a34-46d6-91b6-2a679c394ec9') Replication job unfinished (drive: 'sda', srcDisk: {u'device': u'disk', u'poolID': u'00000001-0001-0001-0001-000000000311', u'volumeID': u'80c9f20c-3dd7-466b-808d-62337f6cc153', u'domainID': u'4828a460-1825-4ba9-b456-64afad078198', u'imageID': u'da3c126e-d848-4b2e-b1fe-eb9c4cf6b04d'}, job: {'end': 5059117056L, 'bandwidth': 0L, 'type': 2, 'cur': 5040308224L}) (vm:3753)


I believe the engine needs an additional step before VmReplicateDiskFinish that checks that the replication has converged. Alternatively, we could wait and retry VmReplicateDiskFinish a number of times before failing the entire operation.
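
As one possible shape for that extra step, here is a hedged sketch (not actual VDSM or engine code) that polls the block copy job on the host via libvirt until cur == end, retrying a bounded number of times before giving up. The helper name, the `dom` handle, and the retry parameters are assumptions made for illustration:

import time


def wait_for_replication_convergence(dom, drive_path, retries=30, interval=10):
    """Poll the block copy job until cur == end or retries are exhausted.

    `dom` is a libvirt virDomain handle and `drive_path` is the path (or
    device alias) of the drive being replicated. Hypothetical helper.
    """
    for _ in range(retries):
        info = dom.blockJobInfo(drive_path, 0)
        if info and info.get('cur') == info.get('end'):
            return True  # replication converged; safe to pivot
        time.sleep(interval)
    return False

# The engine-side flow would then look roughly like:
#   if wait_for_replication_convergence(dom, drive_path):
#       vm_replicate_disk_finish(...)   # pivot to the destination disk
#   else:
#       fail the live storage migration instead of attempting the pivot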

Comment 9 Adam Litke 2017-03-28 19:36:00 UTC
I just talked to Liron; this is a known issue and already has a bug assigned.

*** This bug has been marked as a duplicate of bug 1312909 ***

Comment 12 Sahina Bose 2017-04-17 07:11:57 UTC
*** Bug 1440198 has been marked as a duplicate of this bug. ***

Comment 14 Tal Nisan 2018-07-02 08:51:29 UTC
The whole LSM flow has been changed; please reopen if the issue still exists.