Bug 1376083
| Summary: | Spice migration can't finish when migrating a guest with spice listening on port '0' from RHEL 7.2 to RHEL 7.3 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Fangge Jin <fjin> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fangge Jin <fjin> |
| Severity: | medium | Priority: | medium |
| Version: | 7.3 | CC: | dyuan, fjin, jdenemar, mzhan, rbalakri, xuzhang, yafu, zpeng |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | x86_64 | OS: | Linux |
| Fixed In Version: | libvirt-2.5.0-1.el7 | Doc Type: | If docs needed, set a value |
| Last Closed: | 2017-06-15 09:07:45 UTC | Type: | Bug |
Updated spice-server on the RHEL 7.3 host to spice-server-0.12.4-19.el7.x86_64 and tested again; the problem still exists.

Created attachment 1200914 [details]
log and domain xml

1) 7.3 -> 7.3 works because libvirtd is not (correctly) waiting for the SPICE_MIGRATE_COMPLETED event
2) in the 7.2 -> 7.2 case libvirtd is waiting for the event and QEMU sends it
3) 7.3 -> 7.2 is broken since libvirtd is waiting for the event, but QEMU doesn't send it

I wasn't able to reproduce the third case; QEMU was sending the event. I'm not sure why it didn't work for you, but we should fix libvirt to behave similarly to case 1. That is, libvirt should not even try to do seamless SPICE migration.

Fixed upstream by
commit 2e164b451ee6a33e2d9ba33d7bf2121f3f4817a0
Refs: v2.2.0-141-g2e164b4
Author: Jiri Denemark <jdenemar>
AuthorDate: Tue Sep 20 17:27:03 2016 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Wed Sep 21 14:11:26 2016 +0200
qemu: Ignore graphics cookie if port == 0
Old libvirt represents
<graphics type='spice'>
<listen type='none'/>
</graphics>
as
<graphics type='spice' autoport='no'/>
In this mode, QEMU doesn't listen for SPICE connections anywhere and
clients have to use the virDomainOpenGraphics* APIs to attach to the domain.
That is, the client has to run on the same host where the domain runs,
and it's impossible to tell the client to reconnect to the destination
QEMU during migration (unless there is some kind of proxy on the host).
While current libvirt correctly ignores such graphics devices when
creating the graphics migration cookie, old libvirt just sends
<graphics type='spice' port='0' listen='0.0.0.0' tlsPort='-1'/>
in the cookie. After seeing this cookie, we would happily call the
client_migrate_info QMP command and wait for the SPICE_MIGRATE_COMPLETED
event, which is quite pointless since the client doesn't know where to
connect anyway. We should just ignore such cookies.
https://bugzilla.redhat.com/show_bug.cgi?id=1376083
Signed-off-by: Jiri Denemark <jdenemar>
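For illustration, the check this commit describes boils down to something like the sketch below. This is not libvirt's actual source; the structure and function names are hypothetical simplifications of the cookie handling the commit message refers to:

```c
/* Hypothetical sketch of the fix above; names and types are simplified
 * and do not match libvirt's real internals. */

typedef struct {
    int port;            /* plain SPICE port advertised in the cookie */
    int tlsPort;         /* TLS port, -1 when disabled */
    const char *listen;  /* listen address advertised by the other side */
} GraphicsCookie;

/* Returns 0 when SPICE relocation is (harmlessly) skipped. */
static int
migrateGraphicsRelocate(const GraphicsCookie *cookie)
{
    /* No graphics cookie at all: nothing to relocate. */
    if (!cookie)
        return 0;

    /* Old libvirt sends port='0' tlsPort='-1' for <listen type='none'/>
     * guests.  QEMU is not listening anywhere, so issuing
     * client_migrate_info and waiting for SPICE_MIGRATE_COMPLETED would
     * block forever.  Ignore such cookies. */
    if (cookie->port <= 0 && cookie->tlsPort <= 0)
        return 0;

    /* ... otherwise send client_migrate_info to QEMU and wait for the
     * SPICE_MIGRATE_COMPLETED event ... */
    return 0;
}
```

Returning early here is safe because, as the commit message explains, clients of such guests attach locally via the virDomainOpenGraphics* APIs and cannot be redirected to the destination anyway.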
Hi Jirka, I tried to verify this bug with the following builds, but the issue still exists:

Source host:
libvirt-1.2.17-13.el7_2.6.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.25.x86_64
spice-server-0.12.4-15.el7_2.2.x86_64

Target host:
libvirt-3.2.0-6.virtcov.el7.x86_64
qemu-kvm-rhev-2.9.0-6.el7.x86_64
spice-server-0.12.8-2.el7.x86_64

Well, the fix was on the source and you're still using an old, unfixed version there, so this is definitely not something the patch from comment 5 could fix. However, from its commit message I'd say the new libvirt on the target host should not even send a graphics cookie, but apparently it did. Could you be so kind and share debug logs from both sides with us?

Created attachment 1283695 [details]
libvirtd log for comment 7

Hmm, this seems to be a different issue according to the logs. There's no graphics cookie sent by the target libvirtd and there's no message in the log which would indicate the source is waiting for the SPICE migration to finish. It seems the migration just hangs somewhere in qemuMigrationConfirmPhase, just after parsing the migration cookie. Could you please attach gdb to the source daemon once it hangs and attach a complete backtrace of it?

Created attachment 1283714 [details]
backtrace of libvirtd

(In reply to Jiri Denemark from comment #10)
> Hmm, this seems to be a different issue according to the logs. There's no
> graphics cookie sent by the target libvirtd and there's no message in the
> log which would indicate the source is waiting for the SPICE migration to
> finish. It seems the migration just hangs somewhere in
> qemuMigrationConfirmPhase, just after parsing the migration cookie. Could
> you please attach gdb to the source daemon once it hangs and attach a
> complete backtrace of it?

Maybe there is a misunderstanding between us? I have been testing VM migration from RHEL 7.2 to a newer release from the beginning.

OK, so the source really is waiting for the SPICE migration to finish according to the backtrace, even though the new libvirt on the destination (correctly) does not send anything about graphics in the migration cookie. And this bug is about fixing the source, so you can't really test it by using the old, unfixed libvirt on the source. You could try to update the source libvirtd, but keep the old qemu-kvm-rhev there.

(In reply to Jiri Denemark from comment #13)
> OK, so the source really is waiting for the SPICE migration to finish
> according to the backtrace, even though the new libvirt on the destination
> (correctly) does not send anything about graphics in the migration cookie.
>
> And this bug is about fixing the source, so you can't really test it by
> using the old, unfixed libvirt on the source. You could try to update the
> source libvirtd, but keep the old qemu-kvm-rhev there.

Migration between newer libvirt versions, between older libvirt versions, or from a newer libvirt version to an older one always succeeds; this bug only occurs when migrating from an older libvirt version (RHEL 7.2) to a newer one. So I think that if I update libvirt before doing the migration, the migration will succeed even without the fix.

I don't really remember any details about this issue, but comment 4 suggests the QEMU version matters too, which is why I suggested updating just libvirt and using it with the older QEMU.

(In reply to Jiri Denemark from comment #15)
> I don't really remember any details about this issue, but comment 4 suggests
> the QEMU version matters too, which is why I suggested updating just libvirt
> and using it with the older QEMU.

After updating libvirt to the latest version, the SPICE migration can be completed. But I still think this bug can't be verified, because we usually don't update the libvirt version when testing cross-version migration.

Hi Jirka, the test results are now as below. It seems the problem can only be fixed on the source side (which is not possible here); should we just close this bug as WONTFIX?

| Direction | Result |
|---|---|
| 7.2 -> 7.3 | Fail (this is what's reported in comment 0) |
| 7.2 -> 7.4 | Fail |
| 7.3 -> 7.2 | Succeed |
| 7.4 -> 7.2 | Succeed |
| 7.2 -> 7.2 | Succeed |
| 7.3 -> 7.3 | Succeed |
| 7.4 -> 7.4 | Succeed |

I think CURRENTRELEASE would be a better reason.

(In reply to Jiri Denemark from comment #18)
> I think CURRENTRELEASE would be a better reason.

OK, I have no objection. Will you do it? Or should I do it?
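For context on the hang debugged above (and visible in the gstack output in the original report below): qemuMigrationWaitForSpice() blocks on a condition variable that is only signalled when QEMU emits the SPICE_MIGRATE_COMPLETED event, so if the event never arrives, the wait never returns. A minimal sketch of that pattern, using assumed names rather than libvirt's actual code:

```c
/* Simplified illustration of the wait shown in the backtrace below;
 * the names here are assumptions, not libvirt's real identifiers. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool spiceMigrated;

/* Called from the migration confirm phase. */
void waitForSpice(void)
{
    pthread_mutex_lock(&lock);
    while (!spiceMigrated)                /* blocks forever if the   */
        pthread_cond_wait(&cond, &lock);  /* event is never raised   */
    pthread_mutex_unlock(&lock);
}

/* Called from the handler for the SPICE_MIGRATE_COMPLETED QMP event. */
void onSpiceMigrateCompleted(void)
{
    pthread_mutex_lock(&lock);
    spiceMigrated = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}
```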
Description of problem:
Start a guest on a RHEL 7.2 host with SPICE listening on port '0' and migrate it to a RHEL 7.3 host using virsh. The virsh command hangs after memory migration is 100% finished, apparently because the SPICE migration can't finish.

```
# virsh dumpxml rhel7.3-0817
...
<graphics type='spice' autoport='no' listen='0.0.0.0'>
  <listen type='address' address='0.0.0.0'/>
</graphics>
...

# virsh migrate rhel7.3-0817 qemu+ssh://10.66.4.152/system --live --verbose
Migration: [100 %]
```

Part of the gstack output of the libvirtd process:

```
Thread 13 (Thread 0x7f8ddd672700 (LWP 29728)):
#0  0x00007f8dec2ff6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f8deec913c6 in virCondWait (c=c@entry=0x7f8d7026ea38, m=m@entry=0x7f8d7026ea10) at util/virthread.c:154
#2  0x00007f8deecae132 in virDomainObjWait (vm=vm@entry=0x7f8d7026ea00) at conf/domain_conf.c:2674
#3  0x00007f8dce35e343 in qemuMigrationWaitForSpice (vm=0x7f8d7026ea00) at qemu/qemu_migration.c:2469
#4  qemuMigrationConfirmPhase (driver=driver@entry=0x7f8d701af860, conn=conn@entry=0x7f8dc400f6d0, vm=0x7f8d7026ea00, cookiein=cookiein@entry=0x7f8dc4018c30 "<qemu-migration>\n <name>rhel7.3-0817</name>\n <uuid>f16dd5c5-f0ca-40b2-82a2-38089cc0e12b</uuid>\n <hostname>fjin-4-152</hostname>\n <hostuuid>4f379b00-5c66-11e2-a064-10604b782a74</hostuuid>\n <statis"..., cookieinlen=cookieinlen@entry=1095, flags=flags@entry=257, retcode=retcode@entry=0) at qemu/qemu_migration.c:3809
#5  0x00007f8dce360bec in qemuMigrationConfirm (conn=0x7f8dc400f6d0, vm=0x7f8d7026ea00, cookiein=cookiein@entry=0x7f8dc4018c30 "<qemu-migration>\n <name>rhel7.3-0817</name>\n <uuid>f16dd5c5-f0ca-40b2-82a2-38089cc0e12b</uuid>\n <hostname>fjin-4-152</hostname>\n <hostuuid>4f379b00-5c66-11e2-a064-10604b782a74</hostuuid>\n <statis"..., cookieinlen=cookieinlen@entry=1095, flags=flags@entry=257, cancelled=cancelled@entry=0) at qemu/qemu_migration.c:3878
#6  0x00007f8dce38cf4f in qemuDomainMigrateConfirm3Params (domain=0x7f8dc4012530, params=<optimized out>, nparams=<optimized out>, cookiein=0x7f8dc4018c30 "<qemu-migration>\n <name>rhel7.3-0817</name>\n <uuid>f16dd5c5-f0ca-40b2-82a2-38089cc0e12b</uuid>\n <hostname>fjin-4-152</hostname>\n <hostuuid>4f379b00-5c66-11e2-a064-10604b782a74</hostuuid>\n <statis"..., cookieinlen=1095, flags=257, cancelled=0) at qemu/qemu_driver.c:12830
#7  0x00007f8deed2fc57 in virDomainMigrateConfirm3Params (domain=domain@entry=0x7f8dc4012530, params=params@entry=0x7f8dc40128b0, nparams=3, cookiein=0x7f8dc4018c30 "<qemu-migration>\n <name>rhel7.3-0817</name>\n <uuid>f16dd5c5-f0ca-40b2-82a2-38089cc0e12b</uuid>\n <hostname>fjin-4-152</hostname>\n <hostuuid>4f379b00-5c66-11e2-a064-10604b782a74</hostuuid>\n <statis"..., cookieinlen=1095, flags=257, cancelled=0) at libvirt-domain.c:5329
#8  0x00007f8def965a06 in remoteDispatchDomainMigrateConfirm3Params (server=<optimized out>, msg=0x7f8df15ae8b0, args=0x7f8dc4011960, rerr=0x7f8ddd671c30, client=<optimized out>) at remote.c:5672
#9  remoteDispatchDomainMigrateConfirm3ParamsHelper (server=<optimized out>, client=<optimized out>, msg=0x7f8df15ae8b0, rerr=0x7f8ddd671c30, args=0x7f8dc4011960, ret=0x7f8dc40119b0) at remote_dispatch.h:6585
#10 0x00007f8deed9bd82 in virNetServerProgramDispatchCall (msg=0x7f8df15ae8b0, client=0x7f8df15a41d0, server=0x7f8df157f050, prog=0x7f8df159f2d0) at rpc/virnetserverprogram.c:437
#11 virNetServerProgramDispatch (prog=0x7f8df159f2d0, server=server@entry=0x7f8df157f050, client=0x7f8df15a41d0, msg=0x7f8df15ae8b0) at rpc/virnetserverprogram.c:307
#12 0x00007f8deed96ffd in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7f8df157f050) at rpc/virnetserver.c:135
#13 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f8df157f050) at rpc/virnetserver.c:156
#14 0x00007f8deec91c35 in virThreadPoolWorker (opaque=opaque@entry=0x7f8df1573f60) at util/virthreadpool.c:145
#15 0x00007f8deec91158 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
#16 0x00007f8dec2fbdc5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f8dec0291cd in clone () from /lib64/libc.so.6
```

Version-Release number of selected component (if applicable):

RHEL 7.2:
libvirt-1.2.17-13.el7_2.5.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.22.x86_64
spice-server-0.12.4-15.el7_2.1.x86_64

RHEL 7.3:
libvirt-2.0.0-8.el7.x86_64
qemu-kvm-rhev-2.6.0-25.el7.x86_64
spice-server-0.12.4-18.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest on the RHEL 7.2 host.
2. Migrate it to the RHEL 7.3 host.

Actual results:
SPICE migration can't finish.

Expected results:
SPICE migration finishes and virsh exits after the migration completes.

Additional info:
1) Migration from RHEL 7.2 to RHEL 7.2 has no such problem.
2) Migration from RHEL 7.3 to RHEL 7.3 has no such problem.
3) When SPICE listens on a non-zero port, SPICE migration can finish from RHEL 7.2 to RHEL 7.3.
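As background for the virDomainOpenGraphics* note in the fix's commit message: virDomainOpenGraphicsFD() is a real libvirt API (available since libvirt 1.2.8) that returns a local socket connected to a domain's graphics device. The program below is only an illustrative sketch, not part of this bug's reproducer; it shows why such clients are inherently host-local:

```c
/* Illustrative only.  Build with: gcc demo.c -lvirt */
#include <stdio.h>
#include <unistd.h>
#include <libvirt/libvirt.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <domain-name>\n", argv[0]);
        return 1;
    }

    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn)
        return 1;

    virDomainPtr dom = virDomainLookupByName(conn, argv[1]);
    if (!dom) {
        virConnectClose(conn);
        return 1;
    }

    /* Open a local socket to the first graphics device (idx 0).  This
     * only works on the host where the domain runs, which is exactly
     * why a SPICE client attached this way cannot be redirected to the
     * destination host during migration. */
    int fd = virDomainOpenGraphicsFD(dom, 0, 0);
    if (fd < 0)
        fprintf(stderr, "virDomainOpenGraphicsFD failed\n");
    else
        close(fd);  /* a real client would hand this fd to SPICE code */

    virDomainFree(dom);
    virConnectClose(conn);
    return fd < 0 ? 1 : 0;
}
```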