Bug 1285474 - [ppc64le] VM migration fail on qemu-kvm error on 'spapr/htab'
Summary: [ppc64le] VM migration fail on qemu-kvm error on 'spapr/htab'
Status: CLOSED DUPLICATE of bug 1282833
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: David Gibson
QA Contact: Virtualization Bugs
Depends On: 1282833
Blocks: 1284775 1305498 RHEV4.0PPC RHV4.1PPC
Reported: 2015-11-25 16:47 UTC by Ilanit Stein
Modified: 2019-04-30 10:24 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2015-12-01 03:05:43 UTC
Target Upstream Version:

Attachments
vdsm log (716.71 KB, application/x-gzip)
2015-11-25 16:55 UTC, Ilanit Stein
engine log (108.31 KB, application/x-gzip)
2015-11-25 16:56 UTC, Ilanit Stein
qemu log (2.51 KB, text/plain)
2015-11-25 16:56 UTC, Ilanit Stein

Description Ilanit Stein 2015-11-25 16:47:27 UTC
Description of problem:
VM migration fails with:
2015-11-25T15:51:48.366843Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T15:51:48.366995Z qemu-kvm: load of migration failed: Invalid argument

Version-Release number of selected component (if applicable):
engine - rhevm 3.6.0-20

libvirt 1.2.17-13.el7_2.2.ppc64le


kernel -  3.10.0-327.2.1.el7.ppc64le

How reproducible:
Occurred on one setup with 2 ppc hosts. Did not occur on a second setup with 2 other ppc hosts running the same versions as above.

vdsm.log error:
Thread-1060::DEBUG::2015-11-25 11:05:33,016::migration::558::virt.vm::(stop) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::stopping migration monitor thread
Thread-1060::DEBUG::2015-11-25 11:05:33,016::migration::453::virt.vm::(stop) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::stopping migration downtime thread
Thread-1060::ERROR::2015-11-25 11:05:33,017::migration::208::virt.vm::(_recover) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::internal error: early end of file from monitor: possible problem:
2015-11-25T16:05:32.585940Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T16:05:32.586070Z qemu-kvm: load of migration failed: Invalid argument
Thread-1061::DEBUG::2015-11-25 11:05:33,016::migration::450::virt.vm::(run) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::migration downtime thread exiting
Thread-1060::DEBUG::2015-11-25 11:05:33,017::stompreactor::389::jsonrpc.AsyncoreClient::(send) Sending response
Thread-1060::DEBUG::2015-11-25 11:05:33,064::__init__::206::jsonrpc.Notification::(emit) Sending event {"params": {"notify_time": 42978047030, "3ad53ed2-6bf9-4494-9db5-d7adb7854256": {"status": "Migration Source"}}, "jsonrpc": "2.0", "method": "|virt|VM_status|3ad53ed2-6bf9-4494-9db5-d7adb7854256"}
Thread-1060::ERROR::2015-11-25 11:05:33,065::migration::310::virt.vm::(run) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 294, in run
  File "/usr/share/vdsm/virt/migration.py", line 364, in _startUnderlyingMigration
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 403, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1836, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: internal error: early end of file from monitor: possible problem:
2015-11-25T16:05:32.585940Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T16:05:32.586070Z qemu-kvm: load of migration failed: Invalid argument

Comment 1 Ilanit Stein 2015-11-25 16:55:49 UTC
Created attachment 1098917
vdsm log

Comment 2 Ilanit Stein 2015-11-25 16:56:15 UTC
Created attachment 1098918
engine log

Comment 3 Ilanit Stein 2015-11-25 16:56:50 UTC
Created attachment 1098919
qemu log

Comment 4 Ilanit Stein 2015-11-26 11:29:44 UTC
Re-tested VM migration after disabling memory hot plug, restarting the engine, and restarting the VM (power off, then run again). Migration succeeded.

As in bug 1282833, memory hot plug was the root cause here as well.

Memory hot plug was disabled with:
engine=# insert into vdc_options (option_name, option_value, version)  VALUES ('HotPlugMemorySupported', '{"x86_64":"true","ppc64":"false"}' ,'3.6');
engine=# select * from vdc_options where option_name ='HotPlugMemorySupported';
 option_id |      option_name       |            option_value            | version 
       178 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.0
       179 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.1
       180 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.2
       181 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.3
       182 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.4
       183 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.5
       840 | HotPlugMemorySupported | {"x86_64":"true","ppc64":"false"}  | 3.6
(7 rows)

Comment 5 Michal Skrivanek 2015-11-26 11:31:52 UTC
This seems to be related to memory hotplug, specifically the 1TB maxmem size we use for all VMs.

Comment 6 Qunfang Zhang 2015-11-27 08:07:04 UTC
Reproduced the issue on the following version:

Host A (256G mem):


Host B (128G mem):


# /usr/libexec/qemu-kvm -name test -machine pseries,accel=kvm,usb=off -m 4G,slots=4,maxmem=1024G -smp 4,sockets=1,cores=4,threads=1 -uuid 8aeab7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -monitor stdio -rtc base=utc -device virtio-scsi-pci,id=scsi0,bus=pci.0 -drive file=RHEL-7.2-20151015.0-Server-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0  -drive if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,bootindex=2,id=scsi0-0-1-0 -vnc :10 -msg timestamp=on -usb -device usb-tablet,id=tablet1  -vga std -qmp tcp:0:4666,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:5f:5b:5c -incoming tcp:0:5800
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) 2015-11-27T08:05:15.985197Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-27T08:05:15.985265Z qemu-kvm: load of migration failed: Invalid argument

After changing maxmem from 1024G to 512G, the issue no longer occurs.

Comment 7 David Gibson 2015-12-01 03:05:43 UTC
This is essentially the same problem as bug 1282833 - the destination host cannot allocate a hash page table the same size as the guest had on the source host.  

Although it's allocated outside the guest, the hash page table size is visible to the guest, and so there's no way to migrate if it has a different size on source and destination.
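
To illustrate why halving maxmem in comment 6 could make a difference, here is a rough sketch of how the hash page table size might scale with maxmem. It assumes the table is sized at about 1/128 of maxmem, rounded up to a power of two, with a minimum of 2^18 bytes. That sizing rule is an assumption for illustration and is not stated anywhere in this bug; the function name htab_shift and its parameters are hypothetical.

```python
GiB = 1 << 30

def htab_shift(maxmem_bytes, min_shift=18, max_shift=46):
    """Return log2 of the hash page table size for a given maxmem.

    Assumption (not from this bug): the table is the smallest power of
    two that is at least maxmem / 128, and never below 2**min_shift.
    """
    shift = min_shift
    while shift <= max_shift:
        # 2**(shift + 7) == 128 * 2**shift, i.e. the RAM covered at 1/128
        if (1 << (shift + 7)) >= maxmem_bytes:
            break
        shift += 1
    return shift

if __name__ == "__main__":
    for maxmem_gib in (512, 1024):
        shift = htab_shift(maxmem_gib * GiB)
        size_gib = (1 << shift) // GiB
        print("maxmem=%dG -> htab_shift=%d (%d GiB table)"
              % (maxmem_gib, shift, size_gib))
```

Under that assumption, maxmem=1024G would require an 8 GiB table and maxmem=512G a 4 GiB table, which would be consistent with the smaller destination host failing to allocate a matching table only in the 1024G case.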

*** This bug has been marked as a duplicate of bug 1282833 ***
