Bug 1312719 - migration from f22 to f23 fails to complete
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-02-28 22:47 EST by Jason Tibbitts
Modified: 2016-04-21 10:19 EDT
Doc Type: Bug Fix
Last Closed: 2016-04-21 10:19:48 EDT
Type: Bug

Description Jason Tibbitts 2016-02-28 22:47:41 EST
I have an existing F22 machine hosting some VMs, and a new F23 machine I've brought up to which I'd like to move some of them. The F22 machine is running libvirt-daemon-1.2.13.2-2.fc22.x86_64; its kernel is currently kernel-4.2.8-200.fc22.x86_64, as I don't really want to reboot it. The F23 machine should be running the latest: libvirt-daemon-1.2.18.2-2.fc23.x86_64 and kernel-4.4.2-301.fc23.x86_64.

The machines do _not_ have shared storage. I use LVM pools for storage, and have created a storage volume on the destination host which is sized and named identically to the one on the source host. (This is what I've done for a few releases and it's worked fine in the past.)
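
The volume preparation described above can be sketched as a small shell helper. This is only a minimal sketch and not necessarily the procedure used here: the pool name `vg_guests` and the assumption that the volume is named after the domain are hypothetical, and it assumes `virsh vol-dumpxml` reports the capacity in bytes.

```shell
#!/bin/sh
# Extract the capacity in bytes from volume XML read on stdin.
# Assumes an element of the form <capacity unit='bytes'>N</capacity>.
capacity_from_xml() {
    sed -n 's:.*<capacity[^>]*>\([0-9][0-9]*\)</capacity>.*:\1:p' | head -n 1
}

# Create a same-named, same-sized volume on the destination host.
# $1 = pool name (hypothetical), $2 = volume name, $3 = destination libvirt URI
precreate_volume() {
    pool=$1; vol=$2; dest_uri=$3
    bytes=$(virsh vol-dumpxml --pool "$pool" "$vol" | capacity_from_xml)
    [ -n "$bytes" ] || { echo "could not read capacity of $vol" >&2; return 1; }
    # A capacity given without a suffix is interpreted by virsh as bytes.
    virsh -c "$dest_uri" vol-create-as "$pool" "$vol" "$bytes"
}

# Example (names are assumptions, run on the source host):
#   precreate_volume vg_guests jlt2 qemu+ssh://root@vs05.math.uh.edu/system
```

Reading the size from the source volume rather than typing it by hand avoids a destination volume that is slightly too small for the copied image.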

I start a migration on the F22 machine:

[root@vs04 ~]# virsh migrate --live --persistent --copy-storage-all --compressed --verbose --desturi qemu+ssh://root@vs05.math.uh.edu/system jlt2

Network traffic pegs at about 350 Mbps for about 15 minutes as the 40 GB image is copied to the destination machine.

Then the source machine (vs04) logs:

Feb 28 21:32:04 vs04.math.uh.edu libvirtd[23572]: Cannot start job (modify, none) for domain jlt2; current job is (none, migration out) owned by (0, 23577)
Feb 28 21:32:04 vs04.math.uh.edu libvirtd[23572]: Timed out during operation: cannot acquire state change lock

If I press Ctrl-C after a while, the source machine logs:

Feb 28 21:36:58 vs04.math.uh.edu libvirtd[23572]: operation aborted: migration out: canceled by client

and the destination logs:

Feb 28 21:36:58 vs05.math.uh.edu libvirtd[1106]: internal error: info migration reply was missing return status

However, in the past I've let it sit for a couple of hours and saw the following bits in the log on the destination:

Feb 28 18:40:51 vs05.math.uh.edu libvirtd[1106]: Cannot start job (query, none) for domain jlt2; current job is (none, migration in) owned by (0 <null>, 0 remoteDispatchDomainMigratePrepare3Params) for (0s, 6344s)
Feb 28 18:40:51 vs05.math.uh.edu libvirtd[1106]: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePrepare3Params)
Feb 28 18:41:21 vs05.math.uh.edu libvirtd[1106]: Cannot start job (query, none) for domain jlt2; current job is (none, migration in) owned by (0 <null>, 0 remoteDispatchDomainMigratePrepare3Params) for (0s, 6374s)
Feb 28 18:41:21 vs05.math.uh.edu libvirtd[1106]: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePrepare3Params)
Feb 28 18:41:51 vs05.math.uh.edu libvirtd[1106]: Cannot start job (query, none) for domain jlt2; current job is (none, migration in) owned by (0 <null>, 0 remoteDispatchDomainMigratePrepare3Params) for (0s, 6404s)
Feb 28 18:41:51 vs05.math.uh.edu libvirtd[1106]: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePrepare3Params)
Feb 28 18:42:11 vs05.math.uh.edu libvirtd[1106]: internal error: info migration reply was missing return status
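
To see how long the migration job has been holding the state change lock, the last field of the "Cannot start job" lines can be pulled out with standard tools. This is a sketch that assumes the message format shown in the excerpt above; `job_hold_seconds` is a hypothetical helper name, and reading the second duration as the age of the migration job is an interpretation of the log, not something the report states.

```shell
#!/bin/sh
# Print the second duration (in seconds) from each "Cannot start job" line
# on stdin, e.g. 6344 for a line ending in "for (0s, 6344s)".
# Lines without that trailing field produce no output.
job_hold_seconds() {
    sed -n 's/.*Cannot start job.* for ([0-9]*s, \([0-9][0-9]*\)s)$/\1/p'
}

# Example (on the destination host):
#   journalctl -u libvirtd | job_hold_seconds | tail -n 1
```

For the first line in the excerpt this yields 6344 seconds, roughly an hour and three quarters, which matches the "couple of hours" the migration was left running.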

I'm not sure if I'm doing something wrong, as this procedure did work fine to migrate from F21 to F22.
Comment 1 Cole Robinson 2016-03-04 16:22:38 EST
Jiri, does anything above ring a bell? Maybe a bug fix we need to backport to F23?
Comment 2 Cole Robinson 2016-03-16 16:53:25 EDT
Jason, if you're just looking to get it working, you can look into grabbing newer libvirt versions for both source and destination from the virt-preview repo:

https://fedoraproject.org/wiki/Virtualization_Preview_Repository
Comment 3 Jason Tibbitts 2016-03-16 17:37:58 EDT
I guess my main question is whether the issue is on the source (F22) or the destination (F23) machine. I can play with the F23 machine since I haven't been able to migrate anything to it yet, so I'll pull packages from that repo (or just build what's in rawhide) and see if it helps.
Comment 4 Cole Robinson 2016-03-16 17:41:48 EDT
With migration it really could be either the source or the destination libvirtd. The process is complicated enough that either side wouldn't surprise me.
Comment 5 Jason Tibbitts 2016-04-20 23:45:08 EDT
Finally managed to get back around to this. Updating just the F23 machine (the migration target) to what's currently in the preview repository has no effect.

So I went ahead, rolled the dice, and updated the F22 (migration source) machine as well.  And.... it worked. (!)
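
Since updating libvirt on the source is what made the difference, it may help to record the libvirt versions on both hosts before attempting a migration. The following is a sketch only: it assumes GNU `sort -V` is available and that the remote host is reachable over SSH as in the original command; `version_lt` and `libvirt_version` are hypothetical helper names.

```shell
#!/bin/sh
# Return success if version $1 sorts strictly before version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the installed libvirt-daemon version, locally or on host $1 via SSH.
libvirt_version() {
    if [ -n "$1" ]; then
        ssh "$1" "rpm -q --qf '%{VERSION}\n' libvirt-daemon"
    else
        rpm -q --qf '%{VERSION}\n' libvirt-daemon
    fi
}

# Example (run on the source host):
#   src=$(libvirt_version)
#   dst=$(libvirt_version root@vs05.math.uh.edu)
#   version_lt "$src" "$dst" && echo "source ($src) is older than destination ($dst)"
```

A version gap does not by itself explain the failure, but it is a useful data point to include when reporting a migration problem.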

So, that's some good news. I'm kind of worried about rebooting the F22 machine after upgrading everything, but at least now I can migrate things from it.

I guess this could be closed now if you'd like.
Comment 6 Cole Robinson 2016-04-21 10:19:48 EDT
Thanks for the update. Let's close it.
