Bug 1260409 - Migration 7.2->7.1 failed since libvirtError in 'virDomainMigrateToURI2'
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
3.5.4
Unspecified Unspecified
high Severity urgent
: ovirt-3.6.0-rc3
: 3.6.0
Assigned To: Francesco Romani
Israel Pinto
: AutomationBlocker, Regression
Depends On: 1265111
Blocks:
Reported: 2015-09-06 09:54 EDT by Israel Pinto
Modified: 2016-03-09 14:45 EST (History)
13 users

See Also:
Fixed In Version: libvirt-1.2.17-11.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-09 14:45:03 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
source_host_log (1.64 MB, application/zip)
2015-09-06 10:00 EDT, Israel Pinto
no flags Details
host_in_same_cluster_logs (1.52 MB, application/zip)
2015-09-06 10:02 EDT, Israel Pinto
no flags Details
engine_log (672.66 KB, application/zip)
2015-09-06 10:05 EDT, Israel Pinto
no flags Details

Description Israel Pinto 2015-09-06 09:54:57 EDT
Description of problem:
Migration of a VM fails in automated testing when a host running one VM is put into maintenance.
 

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.5.4.2-1.3.el6ev 
VDSM (RHEL 7.2): vdsm-4.16.26-1.el7ev

How reproducible:
All the time.


Steps to Reproduce:
1.Create VM,run VM
2.Put host to maintenance 
3.Check host and VM status

Actual results:
The host did not switch to maintenance.

Expected results:
The host switches to maintenance and the VM is up and running on the second host.


Additional info:

from vdsm log:
Thread-905::DEBUG::2015-09-04 01:21:01,846::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 27 edom: 20 level: 2 message: XML error: graphics listen attribute 10.35.160.55 must match address attribute of first listen element (found none)
Thread-905::DEBUG::2015-09-04 01:21:01,847::migration::386::vm.Vm::(cancel) vmId=`96aa6da9-b0c0-4047-a001-7df31f288f91`::canceling migration downtime thread
Thread-906::DEBUG::2015-09-04 01:21:01,847::migration::383::vm.Vm::(run) vmId=`96aa6da9-b0c0-4047-a001-7df31f288f91`::migration downtime thread exiting
Thread-905::DEBUG::2015-09-04 01:21:01,847::migration::480::vm.Vm::(stop) vmId=`96aa6da9-b0c0-4047-a001-7df31f288f91`::stopping migration monitor thread
Thread-905::ERROR::2015-09-04 01:21:01,848::migration::161::vm.Vm::(_recover) vmId=`96aa6da9-b0c0-4047-a001-7df31f288f91`::XML error: graphics listen attribute 10.35.160.55 must match address attribute of first listen element (found none)
Thread-905::ERROR::2015-09-04 01:21:02,170::migration::260::vm.Vm::(run) vmId=`96aa6da9-b0c0-4047-a001-7df31f288f91`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 246, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 335, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/virt/vm.py", line 702, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1825, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: XML error: graphics listen attribute 10.35.160.55 must match address attribute of first listen element (found none)
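For context, the error above comes from a consistency check libvirt applies to the domain's graphics device: the legacy listen= attribute on the graphics element must agree with the address= attribute of its first listen subelement. The following is a minimal stand-alone sketch of that rule for illustration only (my own code, not libvirt's actual implementation):

```python
import xml.etree.ElementTree as ET

def check_graphics_listen(graphics_xml):
    """Mimic libvirt's consistency check on a <graphics> element:
    the legacy listen= attribute must match the address= attribute
    of the first <listen> subelement. Returns an error string like
    the one in the vdsm log, or None if the element is consistent."""
    graphics = ET.fromstring(graphics_xml)
    listen_attr = graphics.get('listen')
    listen_elems = graphics.findall('listen')
    if listen_attr is None or not listen_elems:
        return None  # nothing to cross-check
    address = listen_elems[0].get('address')
    if address != listen_attr:
        return ("graphics listen attribute %s must match address attribute "
                "of first listen element (found %s)"
                % (listen_attr, address if address else "none"))
    return None

# A first <listen> subelement with no resolved address triggers the
# "(found none)" variant of the message, as seen in the log above:
bad = ('<graphics type="spice" listen="10.35.160.55">'
       '<listen type="network" network="vdsmnet"/></graphics>')
print(check_graphics_listen(bad))

# With a matching address attribute the check passes and None is returned:
good = ('<graphics type="spice" listen="10.35.160.55">'
        '<listen type="address" address="10.35.160.55"/></graphics>')
print(check_graphics_listen(good))
```

This matches the failure mode in the log: the destination receives a graphics element whose listen attribute is set while the first listen subelement carries no address.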
Comment 1 Israel Pinto 2015-09-06 10:00:04 EDT
Created attachment 1070746 [details]
source_host_log
Comment 2 Israel Pinto 2015-09-06 10:02:15 EDT
Created attachment 1070747 [details]
host_in_same_cluster_logs
Comment 3 Israel Pinto 2015-09-06 10:05:04 EDT
Created attachment 1070748 [details]
engine_log
Comment 4 Francesco Romani 2015-09-07 07:50:22 EDT
It seems this bug and https://bugzilla.redhat.com/show_bug.cgi?id=1260177 share the same root cause.
Comment 5 Francesco Romani 2015-09-07 10:46:23 EDT
Not sure this is a VDSM bug, but taking it to make sure it gets enough bandwidth.
Comment 6 Francesco Romani 2015-09-07 16:13:04 EDT
(In reply to Israel Pinto from comment #0)
> [description and vdsm log from comment #0 snipped]


OK, so both the source and destination host are running VDSM (RHEL 7.2): vdsm-4.16.26-1.el7ev, right?

I suppose libvirt is 1.2.17-5.el7 on both hosts, can you confirm?

Does plain, user-triggered migration between the two hosts work?
Comment 7 Israel Pinto 2015-09-08 01:44:12 EDT
(In reply to Francesco Romani from comment #6)
> (In reply to Israel Pinto from comment #0)
> > [description and vdsm log from comment #0 snipped]
> 
> 
> OK, so both the source and destination host are running VDSM (RHEL 7.2):
> vdsm-4.16.26-1.el7ev, right?
> 
> I suppose libvirt is 1.2.17-5.el7 on both hosts, can you confirm?
> 
> Does plain, user-triggered migration between the two hosts work?

1. Both hosts are RHEL 7.2 with vdsm-4.16.26-1.el7ev
2. I tested (manually):
    user-triggered migration - works fine
    maintenance with one VM - works fine
But in automation these cases failed, and not only in virt testing.
Comment 8 Michal Skrivanek 2015-09-08 09:33:47 EDT
(In reply to Israel Pinto from comment #7)
> > 
> > OK, so both the source and destination host are running VDSM (RHEL 7.2):
> > vdsm-4.16.26-1.el7ev, right?
> > 
> > I suppose libvirt is 1.2.17-5.el7 on both hosts, can you confirm?
> > 
> > Does plain, user-triggered migration between the two hosts work?
> 
> 1. Both hosts are rhel 7.2 with vdsm-4.16.26-1.el7ev
> 2. I tested (manually):
>     user triggered migration - it works fine
>     maintenance with one VM - works fine
> But in automation these cases failed, and not only in virt testing.

Israel, please answer the question: please confirm the libvirt version on those automation machines.
Comment 9 Francesco Romani 2015-09-08 10:54:46 EDT
I fully understand the urgency and I'm working to reproduce the issue and find the root cause.

However, what I'd like to ask is:
According to https://bugzilla.redhat.com/show_bug.cgi?id=1260409#c0:
- automatic migration with just one VM fails
- it always fails

But in VDSM all the flows use the same verb, so this means that every migration should fail, no matter how it is triggered - automatically or by user request.

Is migration completely broken, then? I certainly can't reproduce this.
Comment 10 Israel Pinto 2015-09-08 11:17:19 EDT
libvirt version:
libvirt-1.2.17-6.el7
Comment 11 Francesco Romani 2015-09-08 12:15:36 EDT
(In reply to Israel Pinto from comment #7)

> > I suppose libvirt is 1.2.17-5.el7 on both hosts, can you confirm?
> > 
> > Does plain, user-triggered migration between the two hosts work?
> 
> 1. Both hosts are rhel 7.2 with vdsm-4.16.26-1.el7ev
> 2. I tested (manually):
>     user triggered migration - it works fine
>     maintenance with one VM - works fine
> But in automation these cases failed, and not only in virt testing.

OK, so I need to see the full automation logs. If both maintenance and manual migration work, it's something else - and much less critical.

Please point me to any recent automation failure, with full logs (engine, source, destination). A link to the Jenkins job is fine.
Comment 14 Yaniv Kaul 2015-09-16 06:52:24 EDT
Has anyone looked at the specific issue libvirt is complaining about: 
XML error: graphics listen attribute 10.35.160.55 must match address attribute of first listen element (found none) 

?
Comment 15 Francesco Romani 2015-09-16 07:01:10 EDT
(In reply to Yaniv Kaul from comment #14)
> Has anyone looked at the specific issue libvirt is complaining about: 
> XML error: graphics listen attribute 10.35.160.55 must match address
> attribute of first listen element (found none) 
> 
> ?

Yes. We have already seen this error, but not in this context.
It is indeed worrisome, if confirmed, so I tried to reproduce it a few times, without luck. Please note that even in the very same environment the issue does not happen 100% of the time.
Comment 16 Michal Skrivanek 2015-09-17 05:35:36 EDT
I see a few more runs now... so it seems we have:
#92 - fails 
#93 - fails
#94 - fixed
#95 bogus - env error
#96 network and storage failures, this particular test was skipped
#97 bogus - env error

So... is it actually still failing?
Comment 17 Yaniv Kaul 2015-09-20 10:45:07 EDT
Any chance it's related to bug 1261007 ?
Comment 18 Ilanit Stein 2015-09-22 04:42:29 EDT
The same error as in this bug's description was seen on rhevm-3.5.4.2-1.3.el6ev, for VM migration

From:
-RHEL 7.2:
vdsm-4.16.27-1.el7ev.x86_64
libvirt-1.2.17-9.el7.x86_64

To:
 -RHEV-H 7.1 for RHEV 3.5.4-1 ASYNC (rhev-hypervisor7-7.1-20150911.0):
vdsm-4.16.26-1.el7ev.x86_64
libvirt-1.2.8-16.el7_1.3.x86_64
Comment 19 Ilanit Stein 2015-09-22 04:44:12 EDT
Removing need info from ipinto, following the Depends On: 1265111, by fromani.
Comment 20 Michal Skrivanek 2015-09-22 06:42:23 EDT
(In reply to Ilanit Stein from comment #18)
> Error, same as in this bug description, seen on rhevm-3.5.4.2-1.3.el6ev,
> for VM migration 
> 
> From:
> -RHEL 7.2:
> vdsm-4.16.27-1.el7ev.x86_64
> libvirt-1.2.17-9.el7.x86_64
> 
> To:
>  -RHEV-H 7.1 for RHEV 3.5.4-1 ASYNC (rhev-hypervisor7-7.1-20150911.0):
> vdsm-4.16.26-1.el7ev.x86_64
> libvirt-1.2.8-16.el7_1.3.x86_64

That is a great finding and it helped identify a real compatibility issue, but we need confirmation that this is also what was going on in these automated tests.
Comment 21 Michal Skrivanek 2015-09-22 06:50:29 EDT
#98 user aborted
#99 bogus  - env error
#100 - 18 other errors (storage and sla), the one in question worked ok
#101 - end abruptly, test skipped
#102 - end abruptly, test skipped

We have 2 successes and no failures recently. I suggest closing the bug and getting the environment stable, and focusing on bug 1265111 instead.
Comment 22 Michal Skrivanek 2015-09-23 08:53:55 EDT
(In reply to Michal Skrivanek from comment #21)
> #98 user aborted
> #99 bogus  - env error
> #100 - 18 other errors (storage and sla), the one in question worked ok
> #101 - end abruptly, test skipped
> #102 - end abruptly, test skipped
> 
> We have 2 successes and no failures recently. I suggest closing the bug and
> getting the environment stable, and focusing on bug 1265111 instead.

Finally, #103 fails clearly again, with the same error.
Comment 23 Michal Skrivanek 2015-09-23 09:28:40 EDT
So, I finally dug into all the Jenkins logs, and contrary to what the Jenkins page says, the migration actually happened 7.1 -> 7.2 -> 7.1, and the last hop failed. That is consistent with the manual testing findings (e.g. in https://bugzilla.redhat.com/show_bug.cgi?id=1265111#c5).
Comment 24 Michal Skrivanek 2015-09-24 03:53:26 EDT
Keeping open for a .spec version bump.
Comment 25 Michal Skrivanek 2015-09-25 06:33:40 EDT
After all, the libvirt change will be in 7.2 GA, hence no need for a spec bump.

Note that you need libvirt-1.2.17-11.el7 to test this, regardless of the RHEV/oVirt version.
Comment 26 Fangge Jin 2015-09-28 07:14:38 EDT
I can reproduce this bug with build:
rhel7.2: libvirt-1.2.17-10.el7.x86_64
rhel7.1: libvirt-1.2.8-16.el7_1.3.x86_64

Steps:
1. Register a rhel7.1 host and a rhel7.2 host to rhevm
2. Create a guest on the rhel7.2 host via rhevm
3. Migrate the guest to the rhel7.1 host; migration failed with this error in vdsm.log:

Thread-336::ERROR::2015-09-28 18:59:55,284::migration::161::vm.Vm::(_recover) vmId=`6cf1a976-bd6f-4208-9793-fd0cb2b90188`::XML error: graphics listen attribute 10.66.106.26 must match address attribute of first listen element (found none)
Thread-336::ERROR::2015-09-28 18:59:55,310::migration::260::vm.Vm::(run) vmId=`6cf1a976-bd6f-4208-9793-fd0cb2b90188`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 246, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 325, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1701, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: XML error: graphics listen attribute 10.66.106.26 must match address attribute of first listen element (found none)
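Conceptually, what the fixed libvirt (libvirt-1.2.17-11.el7) has to achieve is that the migration XML sent to the older 7.1 destination carries a first listen subelement whose address agrees with the legacy listen attribute. The sketch below illustrates that kind of normalization; it is an assumption about the shape of the fix for illustration only, not libvirt's actual code:

```python
import xml.etree.ElementTree as ET

def normalize_graphics_listen(domain_xml):
    """For every <graphics> element carrying a legacy listen= attribute,
    ensure a first <listen> subelement exists with a matching address=,
    so an older destination libvirt accepts the XML.
    (Illustrative assumption, not libvirt's real implementation.)"""
    root = ET.fromstring(domain_xml)
    for graphics in root.iter('graphics'):
        addr = graphics.get('listen')
        if addr is None:
            continue
        listen = graphics.find('listen')
        if listen is None:
            # No subelement at all: add one of type "address".
            listen = ET.SubElement(graphics, 'listen', type='address')
        if listen.get('address') != addr:
            # Copy the resolved address into the subelement.
            listen.set('address', addr)
    return ET.tostring(root, encoding='unicode')

# Using the address from the reproduction above:
xml_in = ('<domain><devices>'
          '<graphics type="spice" listen="10.66.106.26">'
          '<listen type="network" network="vdsmnet"/>'
          '</graphics></devices></domain>')
xml_out = normalize_graphics_listen(xml_in)
```

After normalization the first listen subelement carries address="10.66.106.26", so the destination's consistency check no longer reports "(found none)".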
Comment 27 Fangge Jin 2015-09-28 07:16:16 EDT
Verified: pass on build libvirt-1.2.17-11.el7.x86_64

Steps are the same as in comment 26; migration succeeded.
Comment 29 Francesco Romani 2016-01-19 10:27:18 EST
This is a libvirt issue; no need to mention it in the RHEV docs.
Comment 31 errata-xmlrpc 2016-03-09 14:45:03 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
