Bug 1029138 - Migration from host with high cpu to low cpu failed with libvirt error
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assigned To: nobody
QA Contact: Artyom
Whiteboard: virt
Keywords: Triaged
Depends On:
Blocks:
 
Reported: 2013-11-11 13:21 EST by Artyom
Modified: 2014-01-01 03:46 EST
CC: 10 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-21 11:14:50 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
agent and vdsm logs (21.42 KB, application/zip)
2013-11-11 13:21 EST, Artyom
logs (85.62 KB, application/zip)
2013-11-14 08:29 EST, Artyom
Description Artyom 2013-11-11 13:21:18 EST
Created attachment 822537 [details]
agent and vdsm logs

Description of problem:
Migration from a host with a high cpu to one with a low cpu failed with a libvirt error.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-0.1.0-0.5.1.beta1.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install hosted-engine
2. Run hosted-engine --deploy on the first host and finish the process
3. Run hosted-engine --deploy on the second host with the same path to storage, to add the host to HA rhevm
4. Try to migrate from the host with high_cpu to the host with low_cpu

Actual results:
Migration failed with libvirt error:
libvirtError: operation failed: migration job: unexpectedly failed

Expected results:
VM migration succeeds

Additional info:
The cluster CPU level was on the Intel Conroe family the whole time.

host with low_cpu - Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
host with high cpu - Intel(R) Xeon(R) CPU           E5649  @ 2.53GHz

Migration from low to high works fine
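The asymmetry (low→high works, high→low fails) is the classic sign of a CPU feature mismatch: the guest was started with flags the older destination CPU lacks. A minimal sketch of how to spot the gap by diffing the `flags` line of /proc/cpuinfo from both hosts — the flag sets below are illustrative samples, not real dumps from the two Xeons:

```shell
# Sample "flags" values from /proc/cpuinfo (hypothetical, for illustration)
src_flags="fpu vme sse sse2 ssse3 sse4_1 sse4_2 aes"   # newer host (E5649-like)
dst_flags="fpu vme sse sse2 ssse3"                     # older host (5130-like)

# Turn each list into one sorted flag per line
printf '%s\n' $src_flags | sort > src_flags.txt
printf '%s\n' $dst_flags | sort > dst_flags.txt

# Flags present on the source but missing on the destination
comm -23 src_flags.txt dst_flags.txt
```

Any flag printed here that the running guest actually uses would make libvirt abort the migration with a generic "migration job: unexpectedly failed".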
Comment 1 Greg Padgett 2013-11-11 13:53:09 EST
Note, also see bug 1015721 for the scenario from which this originated.
Comment 2 Artyom 2013-11-14 08:29:17 EST
Created attachment 823957 [details]
logs

Added new logs, including the libvirt log
Comment 3 Artyom 2013-11-14 11:23:04 EST
Also checked on regular rhevm is23; the same problem exists:
define: problematic_host - migration to this host failed
        host_1
        host_2
migration from problematic_host to host_1 or host_2 works fine
migration from host_1 to host_2 or from host_2 to host_1 works fine
migration from host_1 or host_2 to problematic_host fails

vdsm, libvirt and selinux versions are the same on all three hosts
Comment 4 Artyom 2013-11-14 12:00:44 EST
After investigation the problem seems to be more general; it also appears in rhevm, not just in hosted engine.

About reproducing: I only managed to reproduce this bug for migration to host cyan-vdsf.qa.lab.tlv.redhat.com, so if some extra investigation is needed on this host, just tell me.
Comment 5 Michal Skrivanek 2013-11-19 06:40:27 EST
This looks exactly like bug 1013617. Would you please check the details there? It's VERIFIED
Comment 6 Artyom 2013-11-21 08:37:13 EST
It looks like the same bug, but I have a new version of vdsm on all hosts;
also, after a little investigation, I found that the problematic host had some additional vdsm packages:
vdsm-tests-4.13.0-0.9.beta1.el6ev.noarch
vdsm-hook-qemucmdline-4.13.0-0.9.beta1.el6ev.noarch
vdsm-hook-vhostmd-4.13.0-0.9.beta1.el6ev.noarch
vdsm-api-4.13.0-0.9.beta1.el6ev.noarch
vdsm-debug-plugin-4.13.0-0.9.beta1.el6ev.noarch
vdsm-debuginfo-4.13.0-0.9.beta1.el6ev.x86_64
vdsm-gluster-4.13.0-0.9.beta1.el6ev.noarch
vdsm-reg-4.13.0-0.9.beta1.el6ev.noarch

So after: 
yum erase vdsm*
yum install vdsm

migration started to work, so one of these packages was responsible for the error.
So now I am not sure whether it is a bug or just some test package.
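The comparison above can be done without a full reinstall by diffing sorted `rpm -qa 'vdsm*'` output from a working host against the problematic one (e.g. collected over ssh). Here the lists are inlined to show just the diff step — the "good" baseline is a hypothetical guess, while the extra entries in the "bad" list are the ones reported on the problematic host:

```shell
# Hypothetical baseline from a host where migration works
sort > good.txt <<'EOF'
vdsm-4.13.0-0.9.beta1.el6ev.x86_64
vdsm-cli-4.13.0-0.9.beta1.el6ev.noarch
EOF

# Package set on the problematic host (baseline plus the reported extras)
sort > bad.txt <<'EOF'
vdsm-4.13.0-0.9.beta1.el6ev.x86_64
vdsm-cli-4.13.0-0.9.beta1.el6ev.noarch
vdsm-tests-4.13.0-0.9.beta1.el6ev.noarch
vdsm-hook-qemucmdline-4.13.0-0.9.beta1.el6ev.noarch
vdsm-hook-vhostmd-4.13.0-0.9.beta1.el6ev.noarch
vdsm-debug-plugin-4.13.0-0.9.beta1.el6ev.noarch
EOF

# Packages present only on the problematic host
comm -13 good.txt bad.txt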
Comment 7 Michal Skrivanek 2013-11-21 10:23:42 EST
It might be good to see which one it is… but it also might have been something messed up in the environment that was corrected by the vdsm reinstall (like restorecon).

If you don't want to invest time into nailing this down, I'd just close the bug…
Comment 8 Artyom 2013-11-21 10:37:12 EST
OK, closing the bug; if I encounter the same problem in the future, I will investigate more deeply.
Thanks
