Bug 1285474 - [ppc64le] VM migration fail on qemu-kvm error on 'spapr/htab'
Summary: [ppc64le] VM migration fail on qemu-kvm error on 'spapr/htab'
Keywords:
Status: CLOSED DUPLICATE of bug 1282833
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: David Gibson
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1282833
Blocks: 1284775 1305498 RHEV4.0PPC RHV4.1PPC
Reported: 2015-11-25 16:47 UTC by Ilanit Stein
Modified: 2019-04-30 10:24 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-01 03:05:43 UTC
Target Upstream Version:
Embargoed:


Attachments
vdsm log (716.71 KB, application/x-gzip), 2015-11-25 16:55 UTC, Ilanit Stein
engine log (108.31 KB, application/x-gzip), 2015-11-25 16:56 UTC, Ilanit Stein
qemu log (2.51 KB, text/plain), 2015-11-25 16:56 UTC, Ilanit Stein

Description Ilanit Stein 2015-11-25 16:47:27 UTC
Description of problem:
VM migration fails with the following qemu-kvm errors:
2015-11-25T15:51:48.366843Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T15:51:48.366995Z qemu-kvm: load of migration failed: Invalid argument

Version-Release number of selected component (if applicable):
engine - rhevm 3.6.0-20

host-
vdsm 4.17.10.1-0.el7ev
libvirt 1.2.17-13.el7_2.2.ppc64le

qemu-img-rhev-2.3.0-31.el7_2.3.ppc64le
qemu-kvm-common-rhev-2.3.0-31.el7_2.3.ppc64le
ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.2.ppc64le
qemu-kvm-tools-rhev-2.3.0-31.el7_2.3.ppc64le
qemu-kvm-rhev-2.3.0-31.el7_2.3.ppc64le

kernel -  3.10.0-327.2.1.el7.ppc64le

How reproducible:
Occurred on one setup with 2 ppc hosts. Did not occur on a second setup with 2 other ppc hosts running the same versions as above.

vdsm.log error:
Thread-1060::DEBUG::2015-11-25 11:05:33,016::migration::558::virt.vm::(stop) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::stopping migration monitor thread
Thread-1060::DEBUG::2015-11-25 11:05:33,016::migration::453::virt.vm::(stop) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::stopping migration downtime thread
Thread-1060::ERROR::2015-11-25 11:05:33,017::migration::208::virt.vm::(_recover) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::internal error: early end of file from monitor: possible problem:
2015-11-25T16:05:32.585940Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T16:05:32.586070Z qemu-kvm: load of migration failed: Invalid argument
Thread-1061::DEBUG::2015-11-25 11:05:33,016::migration::450::virt.vm::(run) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::migration downtime thread exiting
Thread-1060::DEBUG::2015-11-25 11:05:33,017::stompreactor::389::jsonrpc.AsyncoreClient::(send) Sending response
Thread-1060::DEBUG::2015-11-25 11:05:33,064::__init__::206::jsonrpc.Notification::(emit) Sending event {"params": {"notify_time": 42978047030, "3ad53ed2-6bf9-4494-9db5-d7adb7854256": {"status": "Migration Source"}}, "jsonrpc": "2.0", "method": "|virt|VM_status|3ad53ed2-6bf9-4494-9db5-d7adb7854256"}
Thread-1060::ERROR::2015-11-25 11:05:33,065::migration::310::virt.vm::(run) vmId=`3ad53ed2-6bf9-4494-9db5-d7adb7854256`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 294, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 364, in _startUnderlyingMigration
    self._perform_migration(duri, muri)
  File "/usr/share/vdsm/virt/migration.py", line 403, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1836, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: internal error: early end of file from monitor: possible problem:
2015-11-25T16:05:32.585940Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-25T16:05:32.586070Z qemu-kvm: load of migration failed: Invalid argument
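
To pick these failures out of a large vdsm or qemu log, a small filter can help. The following is a minimal sketch, not part of vdsm: the function names `find_qemu_errors` and `is_htab_load_failure` are hypothetical, and the regular expression assumes the timestamped `qemu-kvm:` line format shown above.

```python
import re

# Matches QEMU's timestamped message lines as they appear in this report, e.g.
# 2015-11-25T16:05:32.585940Z qemu-kvm: error while loading state ...
QEMU_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z) qemu-kvm: (?P<msg>.*)$"
)


def find_qemu_errors(log_text):
    """Return (timestamp, message) pairs for every qemu-kvm line in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = QEMU_LINE.match(line.strip())
        if match:
            hits.append((match.group("ts"), match.group("msg")))
    return hits


def is_htab_load_failure(log_text):
    """True if the log shows the 'spapr/htab' device state failing to load."""
    return any(
        "error while loading state" in msg and "'spapr/htab'" in msg
        for _, msg in find_qemu_errors(log_text)
    )
```

Feeding the attached qemu log text to `is_htab_load_failure` would flag a migration that failed for the reason reported here, as opposed to some other migration error.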

Comment 1 Ilanit Stein 2015-11-25 16:55:49 UTC
Created attachment 1098917 [details]
vdsm log

Comment 2 Ilanit Stein 2015-11-25 16:56:15 UTC
Created attachment 1098918 [details]
engine log

Comment 3 Ilanit Stein 2015-11-25 16:56:50 UTC
Created attachment 1098919 [details]
qemu log

Comment 4 Ilanit Stein 2015-11-26 11:29:44 UTC
Checked VM migration again after disabling memory hot plug, restarting the engine, and restarting the VM (power off and run again). Migration was successful.

As in bug 1282833, memory hot plug was the root cause in this bug as well.

Disabled memory hot plug by:
engine=# insert into vdc_options (option_name, option_value, version)  VALUES ('HotPlugMemorySupported', '{"x86_64":"true","ppc64":"false"}' ,'3.6');
INSERT 0 1
engine=# select * from vdc_options where option_name ='HotPlugMemorySupported';
 option_id |      option_name       |            option_value            | version 
-----------+------------------------+------------------------------------+---------
       178 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.0
       179 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.1
       180 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.2
       181 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.3
       182 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.4
       183 | HotPlugMemorySupported | {"x86_64":"false","ppc64":"false"} | 3.5
       840 | HotPlugMemorySupported | {"x86_64":"true","ppc64":"false"}  | 3.6
(7 rows)
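
The option_value column holds a JSON object keyed by architecture, with the flags stored as the strings "true" and "false" rather than JSON booleans. A minimal sketch of checking such a value follows; the helper name `hotplug_enabled` is hypothetical and not part of the engine, and the architecture keys are taken from the rows above.

```python
import json


def hotplug_enabled(option_value, arch):
    """Return True if option_value marks memory hot plug as supported for arch.

    option_value is the JSON text from the vdc_options table, for example
    '{"x86_64":"true","ppc64":"false"}'. An architecture that is missing
    from the object is treated as not supported (an assumption of this sketch).
    """
    flags = json.loads(option_value)
    # The values are strings, so compare text instead of relying on truthiness.
    return str(flags.get(arch, "false")).lower() == "true"
```

For the 3.6 row shown above, `hotplug_enabled(value, "ppc64")` is false while `hotplug_enabled(value, "x86_64")` is true.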

Comment 5 Michal Skrivanek 2015-11-26 11:31:52 UTC
This seems to be related to memory hotplug, specifically the 1TB maxmem size we use for all VMs.

Comment 6 Qunfang Zhang 2015-11-27 08:07:04 UTC
Reproduced the issue on the following versions:

Host A (256G mem):

kernel-3.10.0-327.4.1.el7.ppc64le
qemu-kvm-rhev-2.3.0-31.el7_2.3.ppc64le

Host B (128G mem):

kernel-3.10.0-327.2.1.el7.ppc64le
qemu-kvm-rhev-2.3.0-31.el7_2.3.ppc64le


# /usr/libexec/qemu-kvm -name test -machine pseries,accel=kvm,usb=off -m 4G,slots=4,maxmem=1024G -smp 4,sockets=1,cores=4,threads=1 -uuid 8aeab7e2-f341-4f8c-80e8-59e2968d85c2 -realtime mlock=off -nodefaults -monitor stdio -rtc base=utc -device virtio-scsi-pci,id=scsi0,bus=pci.0 -drive file=RHEL-7.2-20151015.0-Server-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,bootindex=1,id=scsi0-0-0-0  -drive if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,bootindex=2,id=scsi0-0-1-0 -vnc :10 -msg timestamp=on -usb -device usb-tablet,id=tablet1  -vga std -qmp tcp:0:4666,server,nowait -netdev tap,id=hostnet1,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:5a:5f:5b:5c -incoming tcp:0:5800
QEMU 2.3.0 monitor - type 'help' for more information
(qemu) 
(qemu) 2015-11-27T08:05:15.985197Z qemu-kvm: error while loading state for instance 0x0 of device 'spapr/htab'
2015-11-27T08:05:15.985265Z qemu-kvm: load of migration failed: Invalid argument


After changing maxmem from 1024G to 512G, the issue no longer occurs.

Comment 7 David Gibson 2015-12-01 03:05:43 UTC
This is essentially the same problem as bug 1282833 - the destination host cannot allocate a hash page table the same size as the guest had on the source host.  

Although it's allocated outside the guest, the hash page table size is visible to the guest, and so there's no way to migrate if it has a different size on source and destination.

*** This bug has been marked as a duplicate of bug 1282833 ***
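
The link between maxmem and the hash page table size in comment 7 can be illustrated with a little arithmetic. The sketch below is not taken from this bug: it assumes QEMU sizes the table at roughly 1/128 of the maximum RAM size, rounded up to a power of two, with the shift clamped between 18 and 46. The ratio, the bounds, and the function names are all assumptions of this example.

```python
GIB = 1 << 30


def hpt_shift_for_maxmem(maxmem_bytes, min_shift=18, max_shift=46):
    """Smallest shift with 2**shift >= maxmem_bytes / 128, clamped to the bounds.

    The 1/128 ratio and the 18..46 bounds are assumptions of this sketch.
    """
    shift = min_shift
    # 2**(shift + 7) == 128 * 2**shift, so stop once the table covers maxmem.
    while shift < max_shift and (1 << (shift + 7)) < maxmem_bytes:
        shift += 1
    return shift


def hpt_size_bytes(maxmem_bytes):
    """Hash page table size in bytes under the assumed 1/128 rule."""
    return 1 << hpt_shift_for_maxmem(maxmem_bytes)


if __name__ == "__main__":
    for maxmem_gib in (512, 1024):
        size = hpt_size_bytes(maxmem_gib * GIB)
        print(f"maxmem={maxmem_gib}G -> hash page table of {size // GIB} GiB")
```

Under these assumptions, maxmem=1024G calls for an 8 GiB table and maxmem=512G for a 4 GiB one, which would be consistent with comment 6, where the smaller maxmem value let the migration go through.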

