Bug 1459216 - [downstream clone - 4.1.3] Live Storage Migration sequence did not complete, SyncImage task failed on the SPM
Summary: [downstream clone - 4.1.3] Live Storage Migration sequence did not complete, ...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.9
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: ---
Assignee: Daniel Erez
QA Contact: Eyal Shenitzky
URL:
Whiteboard:
Depends On: 1443137
Blocks:
Reported: 2017-06-06 14:41 UTC by rhev-integ
Modified: 2021-05-01 16:48 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1443137
Environment:
Last Closed: 2017-06-07 05:00:48 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3005271 | 0 | None | None | None | 2017-06-06 14:43:24 UTC

Description rhev-integ 2017-06-06 14:41:44 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1443137 +++
======================================================================

Description of problem:

A Live Storage Migration 'hung'. The task handling the internal (base) volume copy on the SPM encountered an error, yet the 'copy' (qemu-img convert) appeared to have completed successfully. A subsequent "UnknownTask" error occurred while trying to stop/clear the task. 

On the engine, the SyncImageGroupDataVDSCommand completed without 'error', yet the LSM sequence just stopped, and so the VmReplicateDiskFinishVDSCommand was never executed. 

As a result, the symptoms were that the base volume copy seemed to have completed, but the active volume 'block copy' job was still running.



Version-Release number of selected component (if applicable):

RHEV 3.6.9
RHEL 7.2 host;
   vdsm-4.17.35-1.el7
   libvirt-1.2.17-13.el7_2.5
   qemu-kvm-rhev-2.3.0-31.el7_2.21


How reproducible:

Not reproducible.


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

(Originally by Gordon Watson)

Comment 12 rhev-integ 2017-06-06 14:42:58 UTC
The LSM implementation in the engine changed drastically in 4.0 (from the SEAT to the CoCo infrastructure). I couldn't reproduce the issue on the latest build. Moving to ON_QA for verification.

(Originally by Daniel Erez)

Comment 13 rhev-integ 2017-06-06 14:43:04 UTC
Can you pls add steps to reproduce?

(Originally by Eyal Shenitzky)

Comment 14 rhev-integ 2017-06-06 14:43:10 UTC
(In reply to Eyal Shenitzky from comment #12)
> Can you pls add steps to reproduce?

There are actually no exact steps to reproduce. It seems like a race that happens occasionally. If you haven't encountered this issue since 4.0, we can close the bug as insufficient_data and reopen it if the issue is reproduced.

(Originally by Daniel Erez)

Comment 15 Eyal Shenitzky 2017-06-07 05:00:48 UTC
I didn't encounter that issue.
I am closing the bug; it can be reopened if the issue is reproduced.

Closing as insufficient_data, as suggested in
https://bugzilla.redhat.com/show_bug.cgi?id=1443137#c13

