Bug 1331322 - [Memory ballooning]Dest guest can't load migration of hot unplugging memory balloon device
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-28 09:52 UTC by aihua liang
Modified: 2016-06-29 13:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-29 13:20:31 UTC
Target Upstream Version:



Description aihua liang 2016-04-28 09:52:46 UTC
Description of problem:
 The destination guest fails to load the migration stream after the memory balloon device has been hot-unplugged on the source.

Version-Release number of selected component (if applicable):
  kernel version:3.10.0-382.el7.x86_64
  qemu version: qemu-kvm-rhev-2.5.0-4.el7.x86_64
  seabios version: seabios-1.9.1-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
 1. Boot both the source and destination guests with a memory balloon device.
 2. Hot-unplug the memory balloon device from the source guest.
 3. Live-migrate from the source guest to the destination guest.
 4. Check whether the destination guest works correctly after migration.
*************************
Details as below:
  a.qemu-kvm commands:
source guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-uuid 1534fa42-4818-4493-9f67-eee5ba758385 \
-nodefaults -nodefconfig -no-user-config \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-boot menu=on,splash-time=10000 \
-drive file=/tmp/target1.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,addr=0xB \
-enable-kvm \
-monitor stdio \
-netdev tap,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=58:54:00:49:b2:5f,addr=0x3 \
-vnc 0:2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,id=ba \

dest guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-uuid 1534fa42-4818-4493-9f67-eee5ba758385 \
-nodefaults -nodefconfig -no-user-config \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-boot menu=on,splash-time=10000 \
-drive file=/tmp/target1.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,addr=0xB \
-enable-kvm \
-monitor stdio \
-netdev tap,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=58:54:00:49:b2:5f,addr=0x3 \
-vnc 0:2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,id=ba \
-incoming tcp:0:4446 

  b. HMP commands for migration (run on the source guest):
(qemu)device_del "ba"                   
(qemu)info balloon                     --->No balloon device                    
(qemu)migrate -d tcp:$dest_host_ip:4446  
(qemu)info status                      --->paused(postmigrate)
**************************************************************
        
Actual results:
 The destination guest fails to load the migration, with the following error message:
    qemu-kvm:Unknown savevm section or instance "0000:00:02.0/virtio-balloon 0"
    qemu-kvm load of migration failed:invalid argument.
 
Expected results:
  The destination guest loads the migration successfully after the memory balloon device has been hot-unplugged on the source.
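
For reading the error under "Actual results", here is a minimal Python sketch that splits the quoted section string into its PCI location and device name. The layout (domain:bus:slot.function, a slash, the device name, then an instance number) is inferred from the message in this report rather than from QEMU documentation, and `parse_section_id` is a made-up helper name.

```python
import re

# Layout inferred from the error text in this report (an assumption):
#   <domain>:<bus>:<slot>.<function>/<device name> <instance id>
SECTION_RE = re.compile(
    r"^(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[0-9a-fA-F]{2})\.(?P<func>[0-7])"
    r"/(?P<name>\S+)(?:\s+(?P<instance>\d+))?$"
)


def parse_section_id(text):
    """Split a PCI-style savevm section string into its parts."""
    m = SECTION_RE.match(text.strip())
    if m is None:
        raise ValueError("not a PCI-style section id: %r" % text)
    inst = m.group("instance")
    return {
        "bus": int(m.group("bus"), 16),
        "slot": int(m.group("slot"), 16),
        "function": int(m.group("func"), 16),
        "device": m.group("name"),
        "instance": int(inst) if inst is not None else None,
    }


info = parse_section_id("0000:00:02.0/virtio-balloon 0")
print(info["device"], "at slot", info["slot"])
```

Read this way, the destination was handed state for a virtio-balloon device at bus 0, slot 2, function 0 and had no device registered under that name.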

Additional info:

Comment 2 Dr. David Alan Gilbert 2016-06-23 16:27:17 UTC
Hi,
  Can you check that the source has really hot-unplugged the device?
After you do your:
    device_del "ba"

do an:
    info pci

  If the device is still there, then migration will still expect it.
What was the guest booted into when you tried this - was it already booted into the guest OS?

Comment 3 aihua liang 2016-06-24 09:46:00 UTC
(In reply to Dr. David Alan Gilbert from comment #2)

I did as requested; the results are below:
(qemu) device_del ba
(qemu) 
(qemu) 
(qemu) info pci
  Bus  0, device   0, function 0:
    Host bridge: PCI device 8086:1237
      id ""
  Bus  0, device   1, function 0:
    ISA bridge: PCI device 8086:7000
      id ""
  Bus  0, device   1, function 1:
    IDE controller: PCI device 8086:7010
      BAR4: I/O at 0xc060 [0xc06f].
      id ""
  Bus  0, device   1, function 3:
    Bridge: PCI device 8086:7113
      IRQ 9.
      id ""
  Bus  0, device   3, function 0:
    Ethernet controller: PCI device 1af4:1000
      IRQ 11.
      BAR0: I/O at 0xc020 [0xc03f].
      BAR1: 32 bit memory at 0xfc052000 [0xfc052fff].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0003fffe].
      id "net0"
  Bus  0, device  11, function 0:
    VGA controller: PCI device 1b36:0100
      IRQ 11.
      BAR0: 32 bit memory at 0xf4000000 [0xf7ffffff].
      BAR1: 32 bit memory at 0xf8000000 [0xfbffffff].
      BAR2: 32 bit memory at 0xfc050000 [0xfc051fff].
      BAR3: I/O at 0xc040 [0xc05f].
      BAR6: 32 bit memory at 0xffffffffffffffff [0x0000fffe].
      id "video0"
(qemu) info balloon 
No balloon device has been activated
(qemu) 

 When I started the migration, the guest OS had already booted.
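
For anyone scripting this check, a minimal Python sketch that pulls the id strings out of `info pci` output like the listing above and reports whether a given id is still listed. The regular expression assumes the `id "..."` line layout shown in this comment; the function names are made up for illustration and are not part of QEMU.

```python
import re


def pci_device_ids(info_pci_text):
    """Return the non-empty id strings found in HMP 'info pci' output."""
    ids = re.findall(r'^\s*id "([^"]*)"\s*$', info_pci_text, re.MULTILINE)
    return [dev_id for dev_id in ids if dev_id]


def device_still_present(info_pci_text, dev_id):
    """True if a device with this id is still listed on the PCI bus."""
    return dev_id in pci_device_ids(info_pci_text)


sample = '''  Bus  0, device   0, function 0:
    Host bridge: PCI device 8086:1237
      id ""
  Bus  0, device   3, function 0:
    Ethernet controller: PCI device 1af4:1000
      id "net0"
'''
print(device_still_present(sample, "ba"))
```

A device that was added without an id shows up as `id ""`, so this only helps when the device was given an id, as "ba" was here.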

Comment 4 aihua liang 2016-06-24 09:50:11 UTC
(In reply to aihua liang from comment #3)
   
   BTW, the balloon option in the qemu commands I gave earlier was wrong:
    Wrong cmdline: -device virtio-balloon-pci,id=balloon0,bus=pci.0,id=ba \
    Correct cmdline: -device virtio-balloon-pci,bus=pci.0,id=ba \
   
   **The scripts on both src and dst need to be updated to the correct one.**
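
A small Python sketch of how a test script could catch this kind of mistake before launching qemu: it reports any key that appears more than once in a `-device` argument. `duplicate_keys` is a made-up helper, and it splits on plain commas only, so it does not handle values that contain escaped commas.

```python
def duplicate_keys(device_arg):
    """Return the option keys that occur more than once in a -device argument."""
    seen = set()
    dups = []
    for part in device_arg.split(","):
        if "=" not in part:
            continue  # the leading driver name has no key
        key = part.split("=", 1)[0]
        if key in seen and key not in dups:
            dups.append(key)
        seen.add(key)
    return dups


print(duplicate_keys("virtio-balloon-pci,id=balloon0,bus=pci.0,id=ba"))
print(duplicate_keys("virtio-balloon-pci,bus=pci.0,id=ba"))
```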

Comment 5 Dr. David Alan Gilbert 2016-06-24 11:00:26 UTC
Interesting;  I've tried repeating it on the current 2.6.0-9 package and it seems OK from the test I did; I was using a slightly simpler line:


/usr/libexec/qemu-kvm -nographic -M pc,accel=kvm -m 4G -device virtio-balloon-pci,bus=pci.0,id=ba  /home/vms/f20.qcow2

Can you retest it with that version please, and also check that the device_del is being done on the source side? If it still fails, are you using a script to drive this? If so, I'd appreciate a copy of the script.

Comment 6 aihua liang 2016-06-28 09:54:45 UTC
(In reply to Dr. David Alan Gilbert from comment #5)

Sorry, I only saw the mail today. I'll try again and respond later.

Comment 7 aihua liang 2016-06-28 12:38:55 UTC
(In reply to aihua liang from comment #6)

I have tested with qemu-kvm-rhev 2.6.0-8; the problem does not occur when I hot-unplug the virtio balloon device.

But when I hot-plug a memory balloon device on the source and then do a live migration, the same error still appears on the destination.
*****source operation************
 (qemu)device_add virtio-balloon-pci,id=balloon0 
 (qemu)migrate -d tcp:$dest_ip:port
 (qemu)info status           ----->paused(postmigrate)

*****error message in dst******
  qemu-kvm:Unknown savevm section or instance "0000:00:02.0/virtio-balloon 0"
  qemu-kvm load of migration failed:invalid argument.

Comment 8 Dr. David Alan Gilbert 2016-06-28 13:45:40 UTC
(In reply to aihua liang from comment #7)

Make sure you use an addr= field on both the device_add and on the qemu command line on the destination to ensure that they end up in the same PCI slot.
If it still fails please show me the source/destination qemu command lines and matching device_add.
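
As a sketch of what "the same addr= on both sides" could look like in a test script (Python, with a made-up helper name, and slot 0x5 chosen arbitrarily as a free slot), the same option string can be used for the HMP device_add on the source and for the -device argument on the destination:

```python
def balloon_device_args(dev_id, addr):
    """Build matching source (HMP) and destination (command line) forms.

    addr is the PCI slot number; using one option string for both sides
    keeps the device in the same slot on source and destination.
    """
    opts = f"virtio-balloon-pci,id={dev_id},bus=pci.0,addr={addr:#x}"
    return {
        "hmp_source": f"device_add {opts}",
        "cmdline_dest": ["-device", opts],
    }


args = balloon_device_args("ba", 5)  # slot 0x5 is only an example
print(args["hmp_source"])
print(" ".join(args["cmdline_dest"]))
```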

Comment 9 aihua liang 2016-06-29 03:09:32 UTC
 This is not related to "addr".
    When I start the destination guest without a virtio balloon device, migration after hot-plugging a virtio balloon device on the source fails.
1)Qemu-cmds:
  src-guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-drive file=/home/73test/img/se_test.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-vga qxl \
-enable-kvm \
-monitor stdio \
-vnc 0:20 \

  dest-guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-drive file=/home/balloon/img/se_test.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-vga qxl \
-enable-kvm \
-monitor stdio \
-vnc 0:20 \
-incoming tcp:0:4000 \

2)Hmp cmds on src:
  (qemu)device_add virtio-balloon-pci,id=ba   ---->the virtio balloon device is visible via "lspci" in the guest.
  (qemu)migrate -d tcp:$dest_ip:port
  (qemu)info status      --->status:paused

3) Result: 
    Migration fails with the error message:
     qemu-kvm:Unknown savevm section or instance "0000:00:02.0/virtio-balloon 0"
     qemu-kvm load of migration failed:invalid argument.


 When I start the destination guest with a virtio balloon device, migration after hot-plugging a virtio balloon device on the source succeeds.
1)Qemu-cmds:
  src-guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-drive file=/home/73test/img/se_test.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-vga qxl \
-enable-kvm \
-monitor stdio \
-vnc 0:20 \

  dest-guest:
/usr/libexec/qemu-kvm \
-M pc \
-name rhel7.3-4 \
-machine pc,accel=kvm,usb=off,vmport=off \
-cpu host \
-m 4G \
-smp 8,sockets=8,cores=1,threads=1 \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
-mon chardev=qmp_id_catch_monitor,id=monitor,mode=readline -no-hpet \
-drive file=/home/balloon/img/se_test.qcow2,if=none,id=drive-system-disk0,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,drive=drive-system-disk0,id=d0,bus=ide.0,unit=0 \
-vga qxl \
-enable-kvm \
-monitor stdio \
-vnc 0:20 \
-incoming tcp:0:4000 \
-virtio balloon 
 
2)Hmp cmds on src:
  (qemu)device_add virtio-balloon-pci,id=ba   ---->the virtio balloon device is visible via "lspci" in the guest.
  (qemu)migrate -d tcp:$dest_ip:port
  (qemu)info status      --->status:paused
3) Result: 
    Migration succeeds.

Hot-plug migration fails when the destination guest is started without a balloon device,
but hot-unplug migration succeeds when the destination guest is started with a balloon device.
Is there any fundamental difference between these two operations?

Comment 10 Dr. David Alan Gilbert 2016-06-29 13:20:31 UTC
(In reply to aihua liang from comment #9)


This is the correct behaviour.
Whenever you hot-plug or hot-unplug a device on the source, you should match the change on the command line of the destination.
In the first case you are hot-plugging on the source but haven't added the device on the destination qemu; when the source sends the data for the hot-plugged device, the destination correctly reports an error saying that the device it received isn't found.

The opposite case, where a device is unplugged on the source but still present on the destination, is also an incorrect test; however, we don't have anything to detect it.

Remember: the destination hardware must always match the source; if you hot-plug something on the source, add it to the destination command line.
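
A test harness could apply this rule before starting the migration. The Python sketch below is only an illustration: the function names are invented, and devices are tracked by their id strings, which is a simplification since the slot and type have to match too. It works out the source's effective device set after hot-plug and hot-unplug operations and compares it with the destination's.

```python
def effective_devices(cmdline_devices, hotplugged=(), unplugged=()):
    """Device ids present on the source at migration time."""
    return (set(cmdline_devices) | set(hotplugged)) - set(unplugged)


def migration_mismatch(source, dest):
    """Compare the source's effective devices with the destination's."""
    return {
        # Sent by the source but unknown to the destination: load fails.
        "missing_on_dest": sorted(source - dest),
        # Present only on the destination: wrong, but nothing detects it.
        "extra_on_dest": sorted(dest - source),
    }


# Hot-plug on the source, destination started without the balloon.
src = effective_devices({"net0", "video0"}, hotplugged={"ba"})
print(migration_mismatch(src, {"net0", "video0"}))

# Hot-unplug on the source, destination still started with the balloon.
src = effective_devices({"net0", "ba"}, unplugged={"ba"})
print(migration_mismatch(src, {"net0", "ba"}))
```

Both result lists should be empty before `migrate` is issued.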

