Bug 1459216 - [downstream clone - 4.1.3] Live Storage Migration sequence did not complete, SyncImage task failed on the SPM
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.9
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: ---
Assigned To: Daniel Erez
QA Contact: Eyal Shenitzky
Keywords: ZStream
Depends On: 1443137
Blocks:
Reported: 2017-06-06 10:41 EDT by rhev-integ
Modified: 2017-06-12 13:20 EDT
CC List: 15 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1443137
Environment:
Last Closed: 2017-06-07 01:00:48 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---



External Trackers
Tracker                            ID       Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  3005271  None      None    None     2017-06-06 10:43 EDT
Description rhev-integ 2017-06-06 10:41:44 EDT
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1443137 +++
======================================================================

Description of problem:

A Live Storage Migration (LSM) hung. The SPM task handling the copy of the internal (base) volume reported an error, even though the copy itself (qemu-img convert) appeared to have completed successfully. A subsequent "UnknownTask" error occurred when trying to stop/clear the task.

On the engine, SyncImageGroupDataVDSCommand completed without error, yet the LSM sequence simply stopped, so VmReplicateDiskFinishVDSCommand was never executed.

As a result, the base volume copy appeared to have completed, but the 'block copy' job for the active volume was still running.
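
When triaging a similar hang, one way to confirm the symptom is to check which of the two commands named above ever shows up in the engine log. The following is only a minimal sketch: the helper name, the sample lines, and the substring-matching approach are assumptions made for illustration and do not reflect the real engine.log layout.

```python
# Hypothetical triage helper. Only the two command names come from this
# report; the log line format in sample_log is an assumption.
EXPECTED_ORDER = [
    "SyncImageGroupDataVDSCommand",
    "VmReplicateDiskFinishVDSCommand",
]


def first_missing_step(log_lines, expected=EXPECTED_ORDER):
    """Return the first expected command not seen in order, or None."""
    position = 0
    for line in log_lines:
        # Advance only when the next expected command appears in this line.
        if position < len(expected) and expected[position] in line:
            position += 1
    return expected[position] if position < len(expected) else None


# Illustrative (made-up) lines resembling the state described above:
# the sync command ran, the finish command never did.
sample_log = [
    "START, SyncImageGroupDataVDSCommand(...)",
    "FINISH, SyncImageGroupDataVDSCommand",
]
print(first_missing_step(sample_log))
```

A result naming VmReplicateDiskFinishVDSCommand would match the state described in this report, where the sequence stopped after the sync step.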



Version-Release number of selected component (if applicable):

RHEV 3.6.9
RHEL 7.2 host;
   vdsm-4.17.35-1.el7
   libvirt-1.2.17-13.el7_2.5
   qemu-kvm-rhev-2.3.0-31.el7_2.21


How reproducible:

Not reproducible.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

(Originally by Gordon Watson)
Comment 12 rhev-integ 2017-06-06 10:42:58 EDT
The LSM implementation in the engine changed drastically in 4.0 (from the SEAT to the CoCo infra). I couldn't reproduce the issue on the latest build. Moving to ON_QA for verification.

(Originally by Daniel Erez)
Comment 13 rhev-integ 2017-06-06 10:43:04 EDT
Can you pls add steps to reproduce?

(Originally by Eyal Shenitzky)
Comment 14 rhev-integ 2017-06-06 10:43:10 EDT
(In reply to Eyal Shenitzky from comment #12)
> Can you pls add steps to reproduce?

There are actually no exact steps to reproduce. It looks like a race that happens occasionally. If you haven't encountered this issue since 4.0, we can close the bug as insufficient_data and reopen it if the issue is reproduced.

(Originally by Daniel Erez)
Comment 15 Eyal Shenitzky 2017-06-07 01:00:48 EDT
I didn't encounter the issue, so I am closing the bug; it can be reopened if the issue is reproduced.

Closing as insufficient_data, as suggested in
https://bugzilla.redhat.com/show_bug.cgi?id=1443137#c13
