Description of problem:
Migration with a bridged network (eth + macvtap + vepa) cannot migrate back.

Version-Release number of selected component (if applicable):
libvirt-0.9.10-6.el6.x86_64
kernel-2.6.32-244.el6.x86_64
qemu-kvm-0.12.1.2-2.241.el6.x86_64

How reproducible:
80%

Steps to Reproduce:
Set up the migration environment using NFS. On both the source and target host:
1. Prepare the following vepa-network XML:
# cat vepa-network.xml
<network>
  <name>vepa-net</name>
  <forward dev='eth0' mode='vepa'>
    <interface dev='eth0'/>
    <interface dev='eth1'/>
    <interface dev='eth2'/>
    <interface dev='eth3'/>
  </forward>
</network>
2. Define and start vepa-network:
# virsh net-define vepa-network.xml
# virsh net-start vepa-net
On the source host:
3. Start a guest whose interface uses vepa-net:
<interface type='network'>
  <mac address='52:54:00:1b:6f:e5'/>
  <source network='vepa-net'/>
  <target dev='vnet0'/>
  <model type='virtio'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
4. Migrate (the guest here is named "migrate"):
# virsh migrate --live migrate qemu+ssh://${target_ip}/system
On the target host:
5. Migrate back:
# virsh migrate --live migrate qemu+ssh://${source_ip}/system

Actual results:
When steps 4 and 5 are executed repeatedly, 80% of the runs fail on the target host with an error like:
error: internal error guest unexpectedly quit

Expected results:
100% success

Additional info:
I have captured the libvirtd log as an attachment, but there is no error information in it. I get the same result with libvirt-0.9.9-2.el6.x86_64 and libvirt-0.9.10-5.el6.x86_64.
source host: DELL OPTIPLEX 760 (4-core Intel Q9400)
target host: DELL OPTIPLEX 755 (2-core Intel E8400)
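For reference, a minimal sketch that automates the ping-pong loop of steps 4 and 5 to estimate the failure rate. It assumes the guest is named "migrate", passwordless root ssh between the hosts, and ${target_ip}/${source_ip} set in the environment (the script itself is not part of the original reproducer):

# cat ping-pong.sh
#!/bin/bash
# Repeatedly migrate the guest there and back, stopping at the first failure.
GUEST=migrate
for i in $(seq 1 10); do
    # step 4: source -> target
    virsh migrate --live "$GUEST" qemu+ssh://${target_ip}/system \
        || { echo "run $i: forward migration failed"; exit 1; }
    # step 5: target -> source, issued on the target host over ssh
    ssh root@${target_ip} virsh migrate --live "$GUEST" qemu+ssh://${source_ip}/system \
        || { echo "run $i: migration back failed"; exit 1; }
    echo "run $i: round trip OK"
done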
Created attachment 571332 [details] libvirtd.log
When I do the migration with a bridged network (eth + macvtap + bridge), I get the same result. The network XML:
<network>
  <name>bridge-net</name>
  <forward dev='eth0' mode='bridge'>
    <interface dev='eth0'/>
  </forward>
</network>
and the guest interface:
<interface type='network'>
  <mac address='52:54:00:1b:6f:e5'/>
  <source network='bridge-net'/>
  <target dev='vnet0'/>
  <model type='virtio'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
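Side note: to double-check which macvtap mode libvirt actually applied to the guest's interface, the details flag of ip(8) can be used on the host while the guest is running. A sketch, assuming libvirt named the device macvtap0 (it picks the next free index):

# ip -d link show macvtap0

The detailed output should contain a "macvtap mode bridge" (or "mode vepa") token matching the <forward mode='...'/> of the network.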
The attached libvirtd.log is from the source host, and it appears the migration was completed, but then the daemon was notified that the domain could not be resumed on the target host, and thus the whole migration process was aborted. The reason it failed on the target host is most likely written in /var/log/libvirt/qemu/DOMAIN.log on the target host. Attaching libvirtd.log from the target host might help us as well. Note that the source and target host terms I use have a slightly different meaning from the terms used in the bug description: source/target are used from the point of view of the migration process, i.e., the machines swap their roles after step 4.
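In case the logs need to be re-captured with more detail, a sketch of the usual way to enable debug logging on both hosts (these are the stock libvirtd.conf settings and log path; adjust as needed):

# cat >> /etc/libvirt/libvirtd.conf <<EOF
log_level = 1
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"
EOF
# service libvirtd restart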
Yeah, I got the error log from the source host (which acts as the target host for migrating back). I will add the logs as attachments.
Created attachment 571376 [details] the source host's (acting as target for migrating back) libvirtd.log
Created attachment 571377 [details] the /var/log/libvirt/qemu/mig7.log of the source machine (acting as target for migrating back)
This is a qemu bug. I've managed to reproduce this by hand, without any libvirt intervention. On the source I ran:

LC_ALL=C PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin HOME=/ USER=root QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M pc-1.1 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name f16_nfs -uuid 960f36c5-0b4b-07d0-936b-cd9775d8b526 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/f16_nfs.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot c -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/masina_nfs/f16_nfs.qcow2,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,ifname=tap0,script=no,downscript=no,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:54:63:1a,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

On the destination I ran the same command with the -incoming option added, of course. The migration failed, even on the first try, with:

Unknown savevm section or instance 'kvm-tpr-opt' 0
load of migration failed
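For completeness, a sketch of how the migration between the two hand-started qemu processes can be driven; since the command line above configures a QMP control monitor on /tmp/f16_nfs.monitor, the source can be told to migrate over it (port 4444 and DEST_IP are placeholders of my choosing, and socat is just one way to talk to the unix socket):

# socat - UNIX-CONNECT:/tmp/f16_nfs.monitor
{"execute": "qmp_capabilities"}
{"execute": "migrate", "arguments": {"uri": "tcp:DEST_IP:4444"}}
{"execute": "query-migrate"}

with the destination started with -incoming tcp:0:4444 appended to its command line.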
Orit agreed to look at this one.
I could not reproduce this issue with virsh 0.9.10 and qemu-kvm-0.12.2.1. Can you reproduce the error again?
(In reply to comment #13)
> I could not reproduce this issue with virsh 0.9.10 and qemu-kvm-0.12.2.1.
>
> Can you reproduce the error again?

Where can I find the qemu-kvm-0.12.2.1 package?
I use http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/
(In reply to comment #15)
> I use
> http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/

But there is only qemu-kvm-0.12.1.2-2.265 on that site. OK, I will try.
I can reproduce it, but I get the error "error: Requested operation is not valid: domain 'mig' is not processing incoming migration", with these versions:

libvirt-0.9.10-7.el6.x86_64
qemu-kvm-0.12.1.2-2.265.el6.x86_64

and there is error info in /var/log/libvirt/qemu/mig.log on the source machine (the one being migrated back to).
(In reply to comment #17)
> I can reproduce it, but I get the error "error: Requested operation is not
> valid: domain 'mig' is not processing incoming migration", with these versions:
>
> libvirt-0.9.10-7.el6.x86_64
> qemu-kvm-0.12.1.2-2.265.el6.x86_64
>
> and there is error info in /var/log/libvirt/qemu/mig.log on the source
> machine (the one being migrated back to).

Can you attach the libvirt and qemu logs for both hosts?
Created attachment 574819 [details] source_libvirtd.log
Created attachment 574820 [details] source_mig.log
Created attachment 574821 [details] target_libvirtd.log
Created attachment 574822 [details] target_mig.log
> Can you attach the libvirt and qemu logs for both hosts?

See the attachments below.
(In reply to comment #23)
> > Can you attach the libvirt and qemu logs for both hosts?
>
> See the attachments below.

It looks like a different issue. Maybe you need to update the kernel too.
(In reply to comment #24)
> It looks like a different issue.
> Maybe you need to update the kernel too.

Do you mean I need to use the following versions to reproduce it:
libvirt-0.9.10-7.el6.x86_64
qemu-kvm-0.12.1.2-2.265.el6.x86_64
kernel-2.6.32-244.el6.x86_64
or the versions from when I filed the bug:
libvirt-0.9.10-6.el6.x86_64
kernel-2.6.32-244.el6.x86_64
qemu-kvm-0.12.1.2-2.241.el6.x86_64 ?
The packages at
http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/
are updated every day, so which exact versions do you want? Please tell me clearly. Thanks.
(In reply to comment #25)
> The packages at
> http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/
> are updated every day, so which exact versions do you want?

If the bug is related to VEPA, then it should reproduce with both versions. What I meant is that the qemu-kvm and kernel versions may not match, because we use the nightly builds.
(In reply to comment #24)
> It looks like a different issue.
> Maybe you need to update the kernel too.

I used the following versions to reproduce it:
kernel-2.6.32-259.el6.x86_64
qemu-kvm-0.12.1.2-2.270.el6.x86_64
libvirt-0.9.10-11.el6.x86_64

I got the same "domain 'mig' is not processing incoming migration" error as in comment 17, and could not get the "error: internal error guest unexpectedly quit" reported in this bug. I think the migration-back failures may have the same underlying cause; what do you think? The logs from one successful run and one unsuccessful run follow:
Created attachment 576939 [details] source_libvirtd.log
Created attachment 576940 [details] source_mig.log
Created attachment 576941 [details] target_libvirtd.log
Created attachment 576942 [details] target_mig.log
Moved to RHEL 6.5 due to capacity
Retested with the following two scenarios:

--- Scenario 1 (ping-pong migration, ten times):
Test with the vepa mode of macvtap, set up as follows:
# ip link add link eth0 name macvtap-jun1 type macvtap mode vepa
# ip link set macvtap-jun1 up

--- src cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=ce:c9:53:b6:1f:42 411<>/dev/tap6 -netdev tap,id=dev2,vhost=on,fd=411

--- dst cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=5e:9f:a7:cb:10:84 411<>/dev/tap6 -netdev tap,id=dev2,vhost=on,fd=411 -incoming tcp::5800,server,nowait

==== Scenario 2 (ping-pong migration, ten times):
Test with the bridge mode of macvtap, set up as follows:
# ip link add link eth0 name macvtap-jun8 type macvtap mode bridge
# ip link set macvtap-jun8 up

--- src cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=ae:67:d9:84:91:69 411<>/dev/tap7 -netdev tap,id=dev2,vhost=on,fd=411

--- dst cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=96:2d:f5:36:0c:25 411<>/dev/tap7 -netdev tap,id=dev2,vhost=on,fd=411 -incoming tcp::5800,server,nowait

======
In both scenarios, qemu-kvm and the guest work well after ten ping-pong migrations.

Versions of components:
qemu-kvm-0.12.1.2-2.430.el6.x86_64
kernel-2.6.32-492.el6.x86_64

If any further testing is needed, feel free to update it in the bz.

Best Regards,
Jun Li
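Side note for anyone re-running the scenarios above: the /dev/tap6 and /dev/tap7 character devices used in the fd-passing part correspond to the ifindex of the freshly created macvtap interface, so the right device node can be derived rather than guessed. A sketch, following the macvtap-jun1 device from scenario 1 (the 411<>/dev/tapN redirection opens the macvtap device read-write on fd 411 in the shell before qemu starts, and fd=411 hands that descriptor to the tap netdev):

# IFINDEX=$(cat /sys/class/net/macvtap-jun1/ifindex)
# ls /dev/tap${IFINDEX}
# /usr/libexec/qemu-kvm ... 411<>/dev/tap${IFINDEX} -netdev tap,id=dev2,vhost=on,fd=411 ...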
Thank you! The original bug was either solved in the meantime, or it was a config issue.