Bug 927176
Field | Value
---|---
Summary | libvirt live migration got unexpected fail
Product | Red Hat Enterprise Linux 7
Reporter | EricLee <bili>
Component | libvirt
Assignee | Eric Blake <eblake>
Status | CLOSED CURRENTRELEASE
QA Contact | Virtualization Bugs <virt-bugs>
Severity | medium
Priority | medium
Version | 7.0
CC | acathrow, cwei, dallan, dyuan, eblake, juzhang, lsu, michen, mzhan, quintela, qzhang, virt-maint, xuzhang, zhwang, zpeng
Target Milestone | rc
Hardware | x86_64
OS | Linux
Fixed In Version | libvirt-1.0.5-2.el7
Doc Type | Bug Fix
Last Closed | 2014-06-13 10:46:28 UTC
Type | Bug
Bug Blocks | 961034

Hi, Juan

This bug cannot be reproduced with a direct qemu command line; it only reproduces with libvirt. So could you or some libvirt developers help check whether this is a qemu-kvm issue? Thanks.

Could you attach /var/log/libvirt/qemu/qcow2.log files from both the source and the destination hosts?

I got a different error with the newest libvirt (1.0.4-1.el7) and qemu-kvm (1.4.0-2.el7):

# virsh migrate --live qcow2 qemu+ssh://10.66.84.118/system --verbose
root@10.66.84.118's password:
error: Unable to copy socket file handle: Invalid argument

Source /var/log/libvirt/qemu/qcow2.log:

# cat /var/log/libvirt/qemu/qcow2.log
2013-04-12 02:05:14.555+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name qcow2 -S -M pc-1.3 -enable-kvm -m 1024 -smp 2,sockets=2,cores=1,threads=1 -uuid 896349b0-7c57-5418-69a9-0e21f99d11c0 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qcow2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/rhel64qcow2.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
char device redirected to /dev/pts/1 (label charserial0)

Destination /var/log/libvirt/qemu/qcow2.log:

# cat /var/log/libvirt/qemu/qcow2.log
2013-04-12 02:05:36.036+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name qcow2 -S -M pc-1.3 -enable-kvm -m 1024 -smp 2,sockets=2,cores=1,threads=1 -uuid 896349b0-7c57-5418-69a9-0e21f99d11c0 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qcow2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/rhel64qcow2.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -incoming tcp:0.0.0.0:49152 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
char device redirected to /dev/pts/2 (label charserial0)
load of migration failed
2013-04-12 02:05:36.376+0000: shutting down

Are the qemu logs from the original issue reported against libvirt-1.0.3-1.el7? Mixing several issues hit with several versions in a single bug report makes it a bit hard to analyze.

The next two comments contain the libvirtd logs from the latest packages (qemu-kvm-1.4.0-2.el7.x86_64 and libvirt-1.0.4-1.el7.x86_64); please have a look.

Created attachment 736600 [details]
migrate-fail-source-libvirtd.log
Created attachment 736601 [details]
migrate-fail-dest-libvirtd.log
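
When going through libvirtd logs such as the two attached above, it can help to pull out only the error lines. The following is a minimal sketch in Python, assuming every line has the same shape as the single libvirtd error line quoted elsewhere in this report (timestamp, a numeric id that is likely the thread id, level, function:line, message). The regular expression and the `find_errors` helper are illustrative names, not part of libvirt.

```python
import re

# Assumed line shape, based on the one error line quoted in this report:
#   <timestamp>: <id>: <level> : <function>:<line> : <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): "
    r"(?P<thread>\d+): (?P<level>\w+) : "
    r"(?P<function>\w+):(?P<line>\d+) : (?P<message>.*)$"
)


def find_errors(lines):
    """Yield one dict per line that parses and whose level is 'error'."""
    for raw in lines:
        match = LOG_LINE.match(raw.rstrip("\n"))
        if match and match.group("level") == "error":
            yield match.groupdict()


# The error line quoted in this bug report, used as sample input.
SAMPLE = (
    "2013-04-17 01:50:38.757+0000: 524: error : virNetSocketDupFD:1063 : "
    "Unable to copy socket file handle: Invalid argument"
)

if __name__ == "__main__":
    for hit in find_errors([SAMPLE]):
        print(hit["function"], "->", hit["message"])
```

To scan a real file, pass an open file object to `find_errors`, for example `find_errors(open(path))` where `path` is wherever the attachment was saved.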
Eric, this is fixed by d6670a64, right?

Yep - the source log has this line:

2013-04-17 01:50:38.757+0000: 524: error : virNetSocketDupFD:1063 : Unable to copy socket file handle: Invalid argument

which is evidence of the bug fixed by this upstream commit:

commit d6670a64e1067f29da3c3e032739e140280b763d
Author: Daniel P. Berrange <berrange>
Date: Fri May 3 11:10:50 2013 +0100

    Fix F_DUPFD_CLOEXEC operation args

    The F_DUPFD_CLOEXEC operation with fcntl() expects a single int argument, specifying the minimum FD number for the newly dup'd file descriptor. We were not specifying that causing random stack data to be accessed as the FD number. Sometimes that worked, sometimes it didn't.

    Signed-off-by: Daniel P. Berrange <berrange>

In POST; will be fixed by rebase.

I tested the migration with the latest libvirt packages and found that the migration crashes the guest when I am connected to the guest with spice from another terminal.

Packages:
qemu-kvm-1.4.0-4.el7.x86_64
spice-server-0.12.2-5.1.el7.x86_64
kernel-3.9.0-0.55.el7.x86_64
libvirt-1.0.5-2.el7.x86_64
virt-manager-common-0.10.0-0.1.gitd3f9bc8e.el7.noarch
virt-manager-0.10.0-0.1.gitd3f9bc8e.el7.noarch

Steps:
1. Start a guest on the source.
# virsh list
 Id Name State
----------------------------------------------------
 18 rhel72 running
2. Connect to the guest with spice in another terminal.
# remote-viewer spice://xx:xx:xx:xx:5900
3. Migrate the guest to the target. The guest is not migrated to the target and crashes here.
# virsh migrate --live rhel72 qemu+ssh://xx.xx.xx.xx/system --verbose
root@xx.xx.xx.xx's password:
Migration: [100 %]error: Unable to read from monitor: Connection reset by peer
4. If the guest is migrated without a spice connection, the migration succeeds.
5. Check the qemu log.
# cat /var/log/libvirt/qemu/rhel72.log
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 0.374000 ms, bitrate 12337349397 bps (11765.813252 Mbps)
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer:
main_channel_client_handle_migrate_connected: client 0x7f44a5a00020 connected: 1 seamless 1
red_client_migrate: migrate client with #channels 4
((null):7094): Spice-ERROR **: red_channel.c:1696:red_client_migrate: assertion `pthread_equal(pthread_self(), client->thread_id)' failed
2013-05-15 03:19:59.843+0000: shutting down

So is this the same problem as this bug? Thanks.

This error comes from spice, not libvirt:

((null):7094): Spice-ERROR **: red_channel.c:1696:red_client_migrate: assertion `pthread_equal(pthread_self(), client->thread_id)' failed

so it is not the same issue as what this BZ is fixing in libvirt, and you need to raise a separate bz for that problem.

I can reproduce this with:
libvirt-1.0.3-1.el7.x86_64
qemu-kvm-1.4.0-1.el7.x86_64

Verified with:
libvirt-1.0.5-2.el7.x86_64
qemu-kvm-1.4.0-4.el7.x86_64

The steps are the same as in the description; migration worked as expected.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
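
The upstream commit message quoted in this report says that F_DUPFD_CLOEXEC takes one int argument, the minimum FD number for the duplicate, and that omitting it made the call read random stack data. The sketch below illustrates that argument in Python rather than in libvirt's C; it is not libvirt's code, and `dup_cloexec` and `demo` are hypothetical helper names. It passes the minimum explicitly, then uses a deliberately out-of-range minimum as a stand-in for a garbage value, which on Linux is expected to fail with EINVAL, the "Invalid argument" seen in the error message.

```python
import errno
import fcntl
import os
import socket


def dup_cloexec(fd, min_fd=0):
    """Duplicate fd onto the lowest free descriptor >= min_fd, close-on-exec set."""
    # The third argument is the minimum descriptor number for the copy.
    # In C the call is variadic, so leaving it out still compiles; here
    # it is always passed explicitly.
    return fcntl.fcntl(fd, fcntl.F_DUPFD_CLOEXEC, min_fd)


def demo():
    """Return a small dict describing a good and a bad duplication attempt."""
    left, right = socket.socketpair()
    result = {}
    try:
        copy = dup_cloexec(left.fileno(), 0)
        flags = fcntl.fcntl(copy, fcntl.F_GETFD)
        result["distinct"] = copy != left.fileno()
        result["cloexec"] = bool(flags & fcntl.FD_CLOEXEC)
        os.close(copy)

        # Stand-in for a garbage argument: a minimum far above any
        # descriptor limit the process can have.
        try:
            os.close(dup_cloexec(left.fileno(), 2**31 - 1))
            result["bad_min_errno"] = None
        except OSError as exc:
            result["bad_min_errno"] = exc.errno
    finally:
        left.close()
        right.close()
    return result


if __name__ == "__main__":
    print(demo())
```

A small garbage value that happens to be a valid minimum would succeed instead, which matches the "sometimes that worked, sometimes it didn't" behaviour described in the commit message.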
Created attachment 715938 [details]
migrate-fail-libvirtd.log

Description of problem:
libvirt live migration got unexpected fail

Version-Release number of selected component (if applicable):
# rpm -qa libvirt qemu-kvm kernel
kernel-3.7.0-0.36.el7.x86_64
libvirt-1.0.3-1.el7.x86_64
qemu-kvm-1.4.0-1.el7.x86_64

Reproduce steps:
1. Prepare a migration environment. Guest with its image on an NFS server:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/mnt/rhel64qcow2.img'>
    <seclabel model='selinux' relabel='no'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
2. # virsh migrate --live qcow2 qemu+ssh://10.66.84.118/system --verbose
The authenticity of host '10.66.84.118 (10.66.84.118)' can't be established.
RSA key fingerprint is bb:e7:a8:a5:86:648:62:ae:05:2e:99:75:0a:f6:4b.
Are you sure you want to continue connecting (yes/no)? yes
root@10.66.84.118's password:
error: operation failed: migration job: unexpectedly failed

Qemu cli:
# ps aux | grep qemu
qemu 1587 14.6 3.9 2049928 310980 ? Sl 04:35 2:19 /usr/libexec/qemu-kvm -name qcow2 -S -M pc-1.3 -enable-kvm -m 1024 -smp 2,sockets=2,cores=1,threads=1 -uuid 896349b0-7c57-5418-69a9-0e21f99d11c0 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qcow2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/rhel64qcow2.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7

Actual results:
As in the steps above.

Expected results:
Migration should succeed.

Additional info:
For detailed logs, please refer to the attachment.

I have debugged this error. It seems to be a qemu-kvm problem: I cannot reproduce it with libvirt-1.0.3-1 and qemu-kvm-1.3.0-6, but it can be reproduced with libvirt-1.0.2-1 and qemu-kvm-1.4.0-1. However, I cannot reproduce it using only the qemu command line with the newest qemu-kvm package. So it is probably because qemu-kvm has been updated while libvirt has not.