Bug 804951 - Migrate with Bridged network, eth + macvtap + vepa, could not migrate back
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
6.3
Hardware: Unspecified
OS: Unspecified
Priority: medium / Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Amit Shah
Virtualization Bugs
Depends On:
Blocks:
Reported: 2012-03-20 05:45 EDT by EricLee
Modified: 2016-04-26 10:55 EDT
19 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-07-23 06:27:49 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
libvirtd.log (1.83 MB, text/plain), 2012-03-20 05:51 EDT, EricLee
the source (which acts as target for migrating back) libvirtd.log (168.48 KB, text/plain), 2012-03-20 07:55 EDT, EricLee
the /var/log/libvirt/qemu/mig7.log of the source machine (which acts as target for migrating back) (2.96 KB, text/plain), 2012-03-20 07:57 EDT, EricLee
source_libvirtd.log (16.13 MB, text/x-log), 2012-04-03 06:41 EDT, EricLee
source_mig.log (20.64 KB, text/x-log), 2012-04-03 06:42 EDT, EricLee
target_libvirtd.log (432.65 KB, text/x-log), 2012-04-03 06:43 EDT, EricLee
target_mig.log (16.01 KB, text/x-log), 2012-04-03 06:44 EDT, EricLee
source_libvirtd.log (2.39 MB, text/x-log), 2012-04-12 01:18 EDT, EricLee
source_mig.log (3.80 KB, text/x-log), 2012-04-12 01:18 EDT, EricLee
target_libvirtd.log (505.46 KB, text/x-log), 2012-04-12 01:19 EDT, EricLee
target_mig.log (6.20 KB, text/x-log), 2012-04-12 01:19 EDT, EricLee
Description EricLee 2012-03-20 05:45:13 EDT
Description of problem:
Migrate with Bridged network, eth + macvtap + vepa, could not migrate back

Version-Release number of selected component (if applicable):
libvirt-0.9.10-6.el6.x86_64
kernel-2.6.32-244.el6.x86_64
qemu-kvm-0.12.1.2-2.241.el6.x86_64

How reproducible:
80%

Steps to Reproduce:

Set up a migration environment using NFS shared storage.

On both source and target host:

1. Prepare the following vepa-network XML:

# cat vepa-network.xml
<network>
  <name>vepa-net</name>
  <forward dev='eth0' mode='vepa'>
    <interface dev='eth0'/>
    <interface dev='eth1'/>
    <interface dev='eth2'/>
    <interface dev='eth3'/>
  </forward>
</network>

2. Define and start the vepa-net network:

# virsh net-define vepa-network.xml
# virsh net-start vepa-net
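
Before starting the guest, it can help to confirm the network really came up; these are standard virsh queries (output elided):

# virsh net-list --all       (vepa-net should be listed as active)
# virsh net-dumpxml vepa-net (prints the stored definition back)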
 

On the source host:

3. Start a guest with the following interface using vepa-net:

<interface type='network'>
  <mac address='52:54:00:1b:6f:e5'/>
  <source network='vepa-net'/>
  <target dev='vnet0'/>
  <model type='virtio'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>

4. Migrate:

# virsh migrate --live migrate qemu+ssh://${target_ip}/system

On the target host:

5. Migrate back:

# virsh migrate --live migrate qemu+ssh://${source_ip}/system
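
The repetition of steps 4 and 5 described under "Actual results" can be scripted. This is only a sketch: the run stub echoes the commands instead of executing them, and the host addresses are placeholders.

```shell
#!/bin/bash
# Sketch of the ping-pong loop (steps 4 and 5). "run" only echoes the
# commands; drop the stub to execute the real virsh calls.
DOM=migrate               # domain name as used in step 4
SRC=192.168.0.1           # placeholder: source host address
DST=192.168.0.2           # placeholder: target host address

run() { echo "$@"; }      # stub keeping the sketch side-effect free

for i in 1 2 3 4 5; do
  run virsh migrate --live "$DOM" "qemu+ssh://$DST/system"   # step 4
  run virsh migrate --live "$DOM" "qemu+ssh://$SRC/system"   # step 5
done
```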
  

Actual results:
Executing steps 4 and 5 repeatedly, on the target host about 80% of attempts fail with error information like:
error: internal error guest unexpectedly quit

Expected results:
100% success

Additional info:

I have captured the libvirtd log as an attachment, but there is no error information in it.

The same result occurs with libvirt-0.9.9-2.el6.x86_64 and libvirt-0.9.10-5.el6.x86_64.

Source host: DELL OPTIPLEX 760 (4-core Intel Q9400)
Target host: DELL OPTIPLEX 755 (2-core Intel E8400)
Comment 1 EricLee 2012-03-20 05:51:27 EDT
Created attachment 571332 [details]
libvirtd.log
Comment 3 EricLee 2012-03-20 06:16:08 EDT
When I do migration with a Bridged network, eth + macvtap + bridge, I get the same result, with a network XML like:

<network>
  <name>bridge-net</name>
  <forward dev='eth0' mode='bridge'>
    <interface dev='eth0'/>
  </forward>
</network>

and a guest interface like:

<interface type='network'>
  <mac address='52:54:00:1b:6f:e5'/>
  <source network='bridge-net'/>
  <target dev='vnet0'/>
  <model type='virtio'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
Comment 4 Jiri Denemark 2012-03-20 06:41:44 EDT
The attached libvirtd.log is from the source host, and it appears the migration completed, but the daemon was then notified that the domain could not be resumed on the target host, so the whole migration process was aborted. The reason it failed on the target host is most likely recorded in /var/log/libvirt/qemu/DOMAIN.log there. Attaching libvirtd.log from the target host would help us as well.

Note that the source and target host terms I use have a slightly different meaning from the terms in the bug description: source/target are used from the point of view of the migration process, i.e., the machines swap their roles after step 4.
Comment 5 EricLee 2012-03-20 07:48:46 EDT
Yes, I got the error log from the source host (which acts as the target host for migrating back). I will add it as attachments.
Comment 6 EricLee 2012-03-20 07:55:30 EDT
Created attachment 571376 [details]
the source (which as target for migrating back) libvirtd.log
Comment 7 EricLee 2012-03-20 07:57:41 EDT
Created attachment 571377 [details]
the /var/log/libvirt/qemu/mig7.log of source machine(which as target for migrating back)
Comment 9 Michal Privoznik 2012-03-21 11:55:20 EDT
This is a qemu bug. I've managed to reproduce it by hand, without any libvirt involvement. On the source I ran:

LC_ALL=C PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin HOME=/ USER=root QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M pc-1.1 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name f16_nfs -uuid 960f36c5-0b4b-07d0-936b-cd9775d8b526 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/f16_nfs.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot c -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/mnt/masina_nfs/f16_nfs.qcow2,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,ifname=tap0,script=no,downscript=no,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:54:63:1a,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

On the destination, the same command line with the -incoming part added, of course. The migration failed with:

Unknown savevm section or instance 'kvm-tpr-opt' 0
load of migration failed

even on the first try.
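
For context, this by-hand reproduction follows the usual QEMU migration pattern (sketched below; the port and destination host are placeholders):

Destination: the identical qemu-kvm command line, plus
  -incoming tcp:0:4444

Source: in the QEMU monitor, start and watch the migration:
  (qemu) migrate -d tcp:DEST_HOST:4444
  (qemu) info migrate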
Comment 11 Karen Noel 2012-03-28 08:22:51 EDT
Orit agreed to look at this one.
Comment 13 Orit Wasserman 2012-04-01 04:36:18 EDT
I could not reproduce this issue with virsh 0.9.10 and qemu-kvm-0.12.2.1.

Can you reproduce the error again?
Comment 14 EricLee 2012-04-01 05:44:19 EDT
(In reply to comment #13)
> I could not reproducing this issue with virsh 0.9.10 and qemu-kvm-0.12.2.1.
> 
> Can you reproduce it error again?

Where can I find the qemu-kvm-0.12.2.1 package?
Comment 16 EricLee 2012-04-01 06:21:47 EDT
(In reply to comment #15)
> I use
> http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/

But there is only qemu-kvm-0.12.1.2-2.265 on that site.
OK, I will try.
Comment 17 EricLee 2012-04-01 06:57:09 EDT
I can reproduce it, but I get the error "error: Requested operation is not valid: domain 'mig' is not processing incoming migration", with these versions:

libvirt-0.9.10-7.el6.x86_64
qemu-kvm-0.12.1.2-2.265.el6.x86_64

and there is error info in the /var/log/libvirt/qemu/mig.log of the source machine (the target for migrating back).
Comment 18 Orit Wasserman 2012-04-01 07:08:33 EDT
(In reply to comment #17)
> I can reproduce it but get the error info of "error: Requested operation is not
> valid: domain 'mig' is not processing incoming migration", in the versions of:
> 
> libvirt-0.9.10-7.el6.x86_64
> qemu-kvm-0.12.1.2-2.265.el6.x86_64
> 
> and there are error info in the /var/log/libvirt/qemu/mig.log of the source
> machine(which to migrate back).

Can you attach the libvirt log and qemu log from both hosts?
Comment 19 EricLee 2012-04-03 06:41:12 EDT
Created attachment 574819 [details]
source_libvirtd.log
Comment 20 EricLee 2012-04-03 06:42:33 EDT
Created attachment 574820 [details]
source_mig.log
Comment 21 EricLee 2012-04-03 06:43:32 EDT
Created attachment 574821 [details]
target_libvirtd.log
Comment 22 EricLee 2012-04-03 06:44:57 EDT
Created attachment 574822 [details]
target_mig.log
Comment 23 EricLee 2012-04-03 06:48:00 EDT
> Can you attach libvirt log and qemu log for both hosts.

See the attachments above (comments 19-22).
Comment 24 Orit Wasserman 2012-04-11 03:16:47 EDT
(In reply to comment #23)
> As below attachments.

It looks like a different issue.
Maybe you need to update the kernel too.
Comment 25 EricLee 2012-04-11 22:44:31 EDT
(In reply to comment #24)
> It looks like a different issue.
> Maybe you need to update the kernel too.

Do you mean I should reproduce it with the following versions:
libvirt-0.9.10-7.el6.x86_64
qemu-kvm-0.12.1.2-2.265.el6.x86_64
kernel-2.6.32-244.el6.x86_64
or with the versions from when I filed the bug:
libvirt-0.9.10-6.el6.x86_64
kernel-2.6.32-244.el6.x86_64
qemu-kvm-0.12.1.2-2.241.el6.x86_64 ?
The packages at http://download.devel.redhat.com/nightly/latest-RHEL6.3/6/Server/x86_64/os/Packages/ are updated every day, so which exact versions do you want? Please tell me clearly. Thanks.
Comment 26 Orit Wasserman 2012-04-12 01:00:14 EDT
(In reply to comment #25)
> What's doubtless versions you want ? 
> Please tell me clearly. Thanks.

If the bug is related to VEPA then it should be reproducible with both versions.

What I meant is that the qemu-kvm and kernel versions may not match, because we use the nightly build.
Comment 27 EricLee 2012-04-12 01:16:56 EDT
(In reply to comment #24)
> It looks like a different issue.
> Maybe you need to update the kernel too.

I used the following versions to reproduce:
kernel-2.6.32-259.el6.x86_64
qemu-kvm-0.12.1.2-2.270.el6.x86_64
libvirt-0.9.10-11.el6.x86_64

It gave the same "domain 'mig' is not processing incoming migration" error as comment 17, and I could not get the "error: internal error guest unexpectedly quit" that this bug was filed for. I think the failed migration back may have the same cause. What do you think?

The logs of one successful run and one unsuccessful run are attached below:
Comment 28 EricLee 2012-04-12 01:18:29 EDT
Created attachment 576939 [details]
source_libvirtd.log
Comment 29 EricLee 2012-04-12 01:18:51 EDT
Created attachment 576940 [details]
source_mig.log
Comment 30 EricLee 2012-04-12 01:19:17 EDT
Created attachment 576941 [details]
target_libvirtd.log
Comment 31 EricLee 2012-04-12 01:19:42 EDT
Created attachment 576942 [details]
target_mig.log
Comment 33 Orit Wasserman 2012-07-11 10:10:49 EDT
Moved to RHEL 6.5 due to capacity
Comment 35 Jun Li 2014-07-23 06:21:40 EDT
Retested with the following two scenarios:
---
Scenario 1(do ping-pong migration ten times):
Tested with the vepa mode of macvtap, as follows:
# ip link  add link eth0 name macvtap-jun1 type macvtap mode vepa
# ip link set macvtap-jun1 up
---
src cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native  -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing  -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=ce:c9:53:b6:1f:42 411<>/dev/tap6 -netdev tap,id=dev2,vhost=on,fd=411
---
dst cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native  -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing  -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=5e:9f:a7:cb:10:84 411<>/dev/tap6 -netdev tap,id=dev2,vhost=on,fd=411 -incoming tcp::5800,server,nowait
====
Scenario 2(do ping-pong migration ten times):
Tested with the bridge mode of macvtap, as follows:
# ip link  add link eth0 name macvtap-jun8 type macvtap mode bridge
# ip link set macvtap-jun8 up
---
src cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native  -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing  -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=ae:67:d9:84:91:69 411<>/dev/tap7 -netdev tap,id=dev2,vhost=on,fd=411
---
dst cli:
/usr/libexec/qemu-kvm -S -machine rhel6.6.0,dump-guest-core=off -enable-kvm -m 8G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa68 -rtc base=localtime,clock=host,driftfix=slew -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 -drive file=/media/RHEL-Server-6.6-32.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native  -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=juli,bootindex=1 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4499,server,nowait -serial unix:/tmp/ttyS0,server,nowait -spice port=5931,disable-ticketing  -vga qxl -monitor stdio -monitor tcp:0:7766,server,nowait -monitor unix:/tmp/monitor1,server,nowait -device e1000,id=vn1,netdev=dev2,mac=96:2d:f5:36:0c:25 411<>/dev/tap7 -netdev tap,id=dev2,vhost=on,fd=411 -incoming tcp::5800,server,nowait
======
Both scenarios above (qemu-kvm and guest) work well after ten ping-pong migrations.
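
A note on the "411<>/dev/tap6" fragment in the command lines above: it is a bash redirection that opens the macvtap character device read-write on file descriptor 411, which qemu then uses through fd=411 in -netdev; the tap device number is the macvtap link's ifindex. The redirection itself can be demonstrated with an ordinary file (a temp file stands in for /dev/tapN here, since the real device requires the macvtap link to exist):

```shell
#!/bin/bash
# With the real command lines the opened file is /dev/tapN, where N is
# the macvtap link's ifindex (cat /sys/class/net/macvtap-jun1/ifindex).
tmp=$(mktemp)
echo hello > "$tmp"
exec 411<>"$tmp"     # open read-write on fd 411, like 411<>/dev/tap6
read -u 411 word     # the process can now use fd 411 directly
echo "$word"         # prints: hello
```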

Versions of components:
qemu-kvm-0.12.1.2-2.430.el6.x86_64
kernel-2.6.32-492.el6.x86_64

If any further testing is needed, feel free to update it in the bz.

Best Regards,
Jun Li
Comment 36 Amit Shah 2014-07-23 06:27:49 EDT
Thank you!

The original bug was either solved in the meantime or was a config issue.
