Bug 878975 - [TEXT] vdsm: A successful Live storage migration is logged as a failed merge in vdsm
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.0.0-alpha
Target Release: 4.17.24
Assignee: Adam Litke
QA Contact: Raz Tamir
URL:
Whiteboard:
Duplicates: 998280
Depends On:
Blocks:
Reported: 2012-11-21 17:15 UTC by Dafna Ron
Modified: 2016-07-05 07:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-05 07:46:35 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.0.0+
rule-engine: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


Attachments
log (577.24 KB, application/x-xz)
2012-11-21 17:15 UTC, Dafna Ron
## Logs rhevm, vdsm, libvirt (1.12 MB, application/x-gzip)
2013-06-13 12:20 UTC, vvyazmin@redhat.com
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (1.90 MB, application/x-gzip)
2013-10-20 10:31 UTC, vvyazmin@redhat.com

Description Dafna Ron 2012-11-21 17:15:39 UTC
Created attachment 649314
log

Description of problem:

After live storage migration, the following error appears in the vdsm log:

libvirtEventLoop::ERROR::2012-11-21 18:59:21,020::libvirtvm::1986::vm.Vm::(_onBlockJobEvent) vmId=`29d84c1b-666e-4554-8e13-8fccee67d28d`::Live merge completed for an unexpected path: /rhev/data-center/edf0ee04-0cc2-4e13-877d-1e89541aea55/a5f10bab-bd9d-4834-b1d9-b29d0ec887dc/images/d1ba6b3d-3c2b-4e39-b454-4f88b8b8bef6/79c0ab39-1a48-4324-a4fa-ca1c4e3b5c47

This is an event emitted by libvirt that vdsm does not trap.

Since this is not actually an error, it would help debugging if we either removed the event from the log or logged it at a level other than ERROR.
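
For triage, entries like the one quoted above can be picked out of a vdsm log mechanically. The following is a minimal sketch, assuming the `::`-separated layout visible in the quoted line (thread, level, timestamp, module, line number, logger, message). `VdsmLogLine`, `parse_vdsm_line` and `is_unexpected_merge_path` are hypothetical names used for illustration and are not part of vdsm.

```python
from typing import NamedTuple, Optional


class VdsmLogLine(NamedTuple):
    """One parsed log entry (hypothetical structure, not a vdsm type)."""
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    message: str


def parse_vdsm_line(line: str) -> Optional[VdsmLogLine]:
    """Split a log line on the first six '::' separators.

    Assumption: the first six fields never contain '::' themselves.
    Returns None for lines that do not match this layout.
    """
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7 or not parts[4].isdigit():
        return None
    thread, level, timestamp, module, lineno, logger, message = parts
    return VdsmLogLine(thread, level, timestamp, module,
                       int(lineno), logger, message)


def is_unexpected_merge_path(entry: VdsmLogLine) -> bool:
    """True for the ERROR entry discussed in this bug."""
    return (entry.level == "ERROR"
            and "Live merge completed for an unexpected path" in entry.message)
```

Feeding each line of a log file through `parse_vdsm_line` and filtering with `is_unexpected_merge_path` would list the affected entries together with their timestamps.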

Version-Release number of selected component (if applicable):

vdsm-4.9.6-44.0.el6_3.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Run live storage migration for several VMs
  
Actual results:

A libvirt event is logged as ERROR.

Expected results:

This event should not be logged as ERROR in the vdsm log.

Additional info: full log

Comment 1 Vered Volansky 2013-01-10 13:37:37 UTC
The message shouldn't be propagated at all.

Comment 2 Federico Simoncelli 2013-01-10 15:26:33 UTC
The issue here is that blockJobAbort (at the end of the live storage migration) generates an event that is similar to the one generated at the end of a live merge. This means that vdsm should try and identify why the event was received (whether it is for live storage migration or live merge). The pseudo code would be something like:

 def eventCallback(...):
     if (the event is related to a live storage migration):
         log...
     elif (the event is related to a live merge):
         (the current logic)
     else:
         log the error
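
The pseudocode above can be sketched as runnable Python. This is a minimal illustration, not vdsm code: it assumes the caller records the drive path whenever it starts a live storage migration or a live merge, and `BlockJobClassifier`, `lsm_paths` and `merge_paths` are hypothetical names.

```python
import logging

log = logging.getLogger("vm.Vm")


class BlockJobClassifier:
    """Hypothetical helper that decides why a block job event arrived."""

    def __init__(self):
        # Assumption: the caller adds a drive path to one of these sets
        # when it starts the corresponding operation on that drive.
        self.lsm_paths = set()
        self.merge_paths = set()

    def on_block_job_event(self, path):
        """Classify a block job event by the drive path it reports."""
        if path in self.lsm_paths:
            # Expected at the end of a live storage migration.
            log.debug("Block job for live storage migration ended: %s", path)
            return "lsm"
        elif path in self.merge_paths:
            # Placeholder for the existing live merge handling.
            log.info("Live merge completed: %s", path)
            return "merge"
        else:
            # Only events that match no known operation are errors.
            log.error("Block job completed for an unexpected path: %s", path)
            return "unexpected"
```

Matching on the path is only one possible way to identify the origin of the event; it was chosen here because the existing error message is already phrased in terms of the path.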

Comment 3 Vered Volansky 2013-01-14 15:54:05 UTC
After talking to Dafna:
This should be reproduced by using a 3.2 DC with a 6.4 host.
This is storage migration (disks) and is not related to VM migration.

Comment 5 vvyazmin@redhat.com 2013-06-13 11:52:44 UTC
I get the same error in a scale environment:

Version-Release number of selected component (if applicable):
RHEVM 3.2 - SF17.5 environment: 

RHEVM: rhevm-3.2.0-11.30.el6ev.noarch 
VDSM: vdsm-4.10.2-22.0.el6ev.x86_64 
LIBVIRT: libvirt-0.10.2-18.el6_4.5.x86_64 
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64 
SANLOCK: sanlock-2.6-2.el6.x86_64
PythonSDK: rhevm-sdk-3.2.0.11-1.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create FCP Data-center
2. Create VM with 25 disks with 'VirtIO' interface
3. Select and move all 25 of the VM's disks from DC-01 to DC-02

Logs attached.

Comment 6 vvyazmin@redhat.com 2013-06-13 12:20:44 UTC
Created attachment 760650
## Logs rhevm, vdsm, libvirt

Comment 7 Ayal Baron 2013-07-07 08:14:58 UTC
Fede, is this vdsm related or libvirt?

Comment 8 Federico Simoncelli 2013-09-02 20:54:47 UTC
*** Bug 998280 has been marked as a duplicate of this bug. ***

Comment 9 vvyazmin@redhat.com 2013-10-20 10:31:19 UTC
The same error was reproduced in the RHEVM 3.3 - IS18 environment:

Host OS: RHEL 6.5

RHEVM:  rhevm-3.3.0-0.25.beta1.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.15-1.el6ev.noarch
VDSM:  vdsm-4.13.0-0.2.beta1.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-27.el6.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.412.el6.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64

VDSM Log:
libvirtEventLoop::DEBUG::2013-10-20 11:21:18,107::vm::4792::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`ce280769-8b99-4810-8f0f-29757ad3abc2`::event Suspended detail 0 opaque None
libvirtEventLoop::ERROR::2013-10-20 11:21:18,128::vm::3837::vm.Vm::(_onBlockJobEvent) vmId=`ce280769-8b99-4810-8f0f-29757ad3abc2`::Live merge completed for an unexpected path: /rhev/data-center/mnt/blockSD/fc786f96-81c2-488f-a5a9-1b2c0a4a0aa2/images/670d77f5-1bf5-4c2d-8b73-8f0a56a6f97a/f9f97f91-17b2-48f4-8f6a-50903ab0ff17
libvirtEventLoop::DEBUG::2013-10-20 11:21:18,184::vm::4792::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`ce280769-8b99-4810-8f0f-29757ad3abc2`::event Resumed detail 0 opaque None
Thread-1872::DEBUG::2013-10-20 11:21:18,185::task::579::TaskManager.Task::(_updateState) Task=`6bd6814f-197c-4ace-93a3-c04a1e5864e3`::moving from state init -> state preparing
Thread-1872::INFO::2013-10-20 11:21:18,185::logUtils::44::dispatcher::(wrapper) Run and protect: teardownImage(sdUUID='561ce535-9830-49a3-975b-ac5fa2915cce', spUUID='1e157529-0537-4696-8a6d-9b4be4680e44', imgUUID='670d77f5-1bf5-4c2d-8b73-8f0a56a6f97a', volUUID=None)


Logs attached

Comment 10 vvyazmin@redhat.com 2013-10-20 10:31:57 UTC
Created attachment 814178
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm

Comment 12 Allon Mureinik 2015-03-31 09:10:01 UTC
Adam, with all the changes around how block jobs are handled in 3.5, is this still relevant?

Comment 13 Adam Litke 2015-04-01 17:55:05 UTC
It's changed slightly.  We don't handle block job events at all so any block job event that comes through (be it from a live merge or LSM) will trigger a warning in the log.  If we want to make any changes at all, we could trap the block job event and print it as a debug message.
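
A rough sketch of the "trap it and print a debug message" option follows. It is an illustration under stated assumptions, not the vdsm implementation: the argument order follows the libvirt-python block job event callback, registration (for example through `domainEventRegisterAny` with `VIR_DOMAIN_EVENT_ID_BLOCK_JOB`) is left out so the snippet runs without libvirt, and the function name and message text are hypothetical.

```python
import logging

log = logging.getLogger("vm.Vm")


def on_block_job_event(conn, dom, path, job_type, status, opaque):
    """Trap a block job event and record it at DEBUG level only.

    conn, dom and opaque are unused in this sketch; they are kept so the
    signature matches what libvirt-python passes to the callback.
    """
    log.debug("Block job event: path=%s type=%s status=%s",
              path, job_type, status)
```

With a handler in place the event no longer reaches any fallback that warns about unhandled events, and it only shows up in the log when DEBUG output is enabled.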

Comment 14 Allon Mureinik 2015-04-02 05:52:30 UTC
(In reply to Adam Litke from comment #13)
> It's changed slightly.  We don't handle block job events at all so any block
> job event that comes through (be it from a live merge or LSM) will trigger a
> warning in the log.  If we want to make any changes at all, we could trap
> the block job event and print it as a debug message.

We probably should, but not as a high priority.
We can attempt to handle this post 3.6.0's feature freeze.

Comment 15 Sandro Bonazzola 2015-10-26 12:36:33 UTC
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015.
Please review this bug and if not a blocker, please postpone to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 16 Yaniv Kaul 2016-03-10 10:11:20 UTC
(In reply to Allon Mureinik from comment #14)
> (In reply to Adam Litke from comment #13)
> > It's changed slightly.  We don't handle block job events at all so any block
> > job event that comes through (be it from a live merge or LSM) will trigger a
> > warning in the log.  If we want to make any changes at all, we could trap
> > the block job event and print it as a debug message.
> 
> We probably should, but not as a high priority.
> We can attempt to handle this post 3.6.0's feature freeze.

Will we? Apart from being confusing, I'd close this bug as WONTFIX.

Comment 17 Allon Mureinik 2016-03-10 10:34:42 UTC
(In reply to Yaniv Kaul from comment #16)
> (In reply to Allon Mureinik from comment #14)
> > (In reply to Adam Litke from comment #13)
> > > It's changed slightly.  We don't handle block job events at all so any block
> > > job event that comes through (be it from a live merge or LSM) will trigger a
> > > warning in the log.  If we want to make any changes at all, we could trap
> > > the block job event and print it as a debug message.
> > 
> > We probably should, but not as a high priority.
> > We can attempt to handle this post 3.6.0's feature freeze.
> 
> Will we? Apart from being confusing, I'd close this bug as WONTFIX.

Worth leaving open to re-examine during the SDM work.

Comment 18 Allon Mureinik 2016-03-28 08:57:32 UTC
Can no longer reproduce with 4.0's codebase, moving to ON_QA for verification.

Comment 19 Raz Tamir 2016-04-10 16:01:21 UTC
Verified on 
ovirt-engine-4.0.0-0.0.master.20160406161747.gita4ecba2.el7.centos.noarch
No error message in vdsm

Comment 20 Sandro Bonazzola 2016-07-05 07:46:35 UTC
oVirt 4.0.0 has been released, closing current release.

