Bug 1709726

Summary:	Forward and backward migration failed with "qemu-kvm: error while loading state for instance 0x0 of device 'spapr'"
Product:	Red Hat Enterprise Linux Advanced Virtualization	Reporter:	xianwang <xianwang>
Component:	qemu-kvm	Assignee:	Laurent Vivier <lvivier>
Status:	CLOSED ERRATA	QA Contact:	xianwang <xianwang>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	8.1	CC:	ddepaula, dgibson, lvivier, ngu, qzhang, virt-maint
Target Milestone:	rc
Target Release:	8.0
Hardware:	ppc64le
OS:	Linux
Whiteboard:
Fixed In Version:	qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1744170		Environment:
Last Closed:	2019-11-06 07:14:47 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1744170

Description xianwang 2019-05-14 08:15:49 UTC

Description of problem:
When doing a backward migration from qemu 4.0.0 to qemu 3.1.0, the migration fails with "(qemu) qemu-kvm: error while loading state for instance 0x0 of device 'spapr'". It happens on p8<->p8, p9<->p9 and p8<->p9.

Version-Release number of selected component (if applicable):
Host A p9(rhel8.0.1):
4.18.0-80.el8.ppc64le
qemu-kvm-3.1.0-25.module+el8.0.1+3188+cd3de524.ppc64le

Host B p9(rhel8.1.0 fast train):
4.18.0-85.el8.ppc64le
qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest on host A:
/usr/libexec/qemu-kvm -M pseries-rhel7.6.0 -nodefaults -monitor stdio

2. Boot an incoming guest on host B:
/usr/libexec/qemu-kvm -M pseries-rhel7.6.0 -nodefaults -monitor stdio -incoming tcp:0:5801

3. Do a forward migration from host A to host B:
(qemu) migrate -d tcp:10.19.128.163:5801
Migration completes and the VM is running on host B.

4. Boot an incoming guest on host A:
/usr/libexec/qemu-kvm -M pseries-rhel7.6.0 -nodefaults -monitor stdio -incoming tcp:0:5801

5. Do a backward migration from host B to host A:
(qemu) migrate -d tcp:10.16.212.186:5801


Actual results:
Migration completed on host B, but qemu crashed on host A.
On host A:
(qemu) qemu-kvm: error while loading state for instance 0x0 of device 'spapr'
qemu-kvm: load of migration failed: No such file or directory
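
For triage across many runs, the destination-side message above can be picked out of a captured monitor log. The following is a minimal sketch, assuming the log is available as text; the helper name find_load_errors is hypothetical and the pattern is derived only from the message quoted in this report.

```python
import re

# Pattern built from the message quoted in this report; other qemu
# versions may word the error differently (assumption).
LOAD_ERROR = re.compile(
    r"error while loading state for instance (0x[0-9a-fA-F]+) of device '([^']+)'"
)


def find_load_errors(log_text):
    """Return (device, instance_id) pairs for each load-state error found."""
    return [
        (match.group(2), int(match.group(1), 16))
        for match in LOAD_ERROR.finditer(log_text)
    ]


sample = (
    "(qemu) qemu-kvm: error while loading state for instance 0x0 of device 'spapr'\n"
    "qemu-kvm: load of migration failed: No such file or directory\n"
)
print(find_load_errors(sample))
```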


Expected results:
Forward and backward migration both work well.

Additional info:
I) This issue also happens with the following builds:
p8(rhel8.0.0):
4.18.0-80.2.1.el8_0.ppc64le
qemu-kvm-2.12.0-64.module+el8.0.0+3180+d6a3561d.2.ppc64le

p9(rhel8.1.0):
4.18.0-85.el8.ppc64le
qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422

Comment 1 xianwang 2019-05-14 08:30:42 UTC

When booting a guest with an OS image from the qemu CLI and migrating from rhel8.1.0 to rhel8.0.0, the migration status is "failed" and the VM keeps running on the source host, but qemu crashes on the destination host as follows:

p9(rhel8.1.0):
4.18.0-85.el8.ppc64le
qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422

p8(rhel8.0.0):
4.18.0-80.2.1.el8_0.ppc64le
qemu-kvm-2.12.0-64.module+el8.0.0+3180+d6a3561d.2.ppc64le

qemu cli:
/usr/libexec/qemu-kvm \
-M pseries-rhel7.6.0,max-cpu-compat=power8 \
-nodefaults \
-monitor stdio \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x08 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/xianwang/rhel800-ppc64le-virtio-scsi.qcow2,node-name=drive_sys1 \
-blockdev driver=qcow2,node-name=drive_image1,file=drive_sys1 \
-device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0 \
-vnc :1 \
-vga std \
-incoming tcp:0:5801

After the migration attempt, on the source host:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off x-ignore-shared: off 
Migration status: failed
total time: 0 milliseconds
(qemu) info status 
VM status: running

On the destination host:
(qemu) qemu-kvm: error while loading state for instance 0x0 of device 'spapr'
qemu-kvm: load of migration failed: No such file or directory
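
The source-side check above can be automated by parsing the "info migrate" text. This is a minimal sketch that assumes the HMP output layout shown in this comment ("capabilities:" as alternating name/value tokens and a "Migration status:" line); parse_info_migrate is a hypothetical helper, not part of qemu.

```python
def parse_info_migrate(text):
    """Extract the migration status and capability flags from `info migrate` text."""
    status = None
    capabilities = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Migration status:"):
            status = line.split(":", 1)[1].strip()
        elif line.startswith("capabilities:"):
            tokens = line[len("capabilities:"):].split()
            # Tokens alternate between "name:" and "on"/"off".
            for name, value in zip(tokens[0::2], tokens[1::2]):
                capabilities[name.rstrip(":")] = value == "on"
    return status, capabilities


sample = (
    "capabilities: xbzrle: off rdma-pin-all: off postcopy-ram: off\n"
    "Migration status: failed\n"
    "total time: 0 milliseconds\n"
)
status, capabilities = parse_info_migrate(sample)
print(status, capabilities)
```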

Comment 2 Laurent Vivier 2019-05-14 08:43:10 UTC

(In reply to xianwang from comment #1)

Could you try the same migration between two P8 hosts?

Comment 3 xianwang 2019-05-15 05:35:45 UTC

(In reply to Laurent Vivier from comment #2)
> Could you try the same migration between two P8 hosts?

I tried the same migration between two P8 hosts; the result is the same as above.
Host A:
4.18.0-80.2.1.el8_0.ppc64le
qemu-kvm-2.12.0-64.module+el8.0.0+3180+d6a3561d.2.ppc64le

Host B:
4.18.0-85.el8.ppc64le
qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422.ppc64le

The steps and results are the same as above:
rhel8.0.0-->rhel8.1.0 works well, but rhel8.1.0-->rhel8.0.0 fails.
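
When scripting such tests, the direction of a migration can be derived from the package names on each host. The sketch below assumes package names of the form "qemu-kvm-X.Y.Z-..." as listed in this report; qemu_version and direction are hypothetical helper names.

```python
import re


def qemu_version(package):
    """Return the upstream (major, minor, micro) tuple from a qemu-kvm package name."""
    match = re.match(r"qemu-kvm-(\d+)\.(\d+)\.(\d+)-", package)
    if match is None:
        raise ValueError(f"unrecognized package name: {package}")
    return tuple(int(part) for part in match.groups())


def direction(src_package, dst_package):
    """Classify a migration as forward, backward, or same-version."""
    src = qemu_version(src_package)
    dst = qemu_version(dst_package)
    if src < dst:
        return "forward"
    if src > dst:
        return "backward"
    return "same-version"


host_a = "qemu-kvm-2.12.0-64.module+el8.0.0+3180+d6a3561d.2.ppc64le"
host_b = "qemu-kvm-4.0.0-0.module+el8.1.0+3169+3c501422.ppc64le"
print(direction(host_a, host_b))  # the direction that works
print(direction(host_b, host_a))  # the direction that fails
```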

Comment 4 Laurent Vivier 2019-05-15 13:31:48 UTC

I think this problem happens because the machine type for rhel-av-8.1.0 has not been implemented in this preview build, so the new features added for qemu-4.0.0 are not disabled for the pseries-rhel7.6.0 machine type migration on the 8.1.0 side.
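
A toy model of the explanation above, under stated assumptions: it is not qemu code, the section names are hypothetical, and it only illustrates why a source that does not disable newer features for an old machine type can emit state that an older destination cannot load.

```python
def sections_sent(all_features, compat_disabled):
    """Sections the source puts in the stream: every feature not disabled by compat."""
    return [name for name in all_features if name not in compat_disabled]


def load_on_destination(stream, known_sections):
    """Return the first section the destination cannot load, or None on success."""
    for section in stream:
        if section not in known_sections:
            return section
    return None


# Hypothetical names: the older destination knows only the base state,
# while the newer source also has one feature added in 4.0.0.
old_destination_knows = {"base-state"}
new_source_features = ["base-state", "hypothetical-4.0-feature"]

# Preview build: no compat entry for the old machine type, so nothing is disabled.
preview_stream = sections_sent(new_source_features, compat_disabled=set())
print(load_on_destination(preview_stream, old_destination_knows))

# With the compat entry in place, the new feature is not sent.
fixed_stream = sections_sent(
    new_source_features, compat_disabled={"hypothetical-4.0-feature"}
)
print(load_on_destination(fixed_stream, old_destination_knows))
```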

Comment 8 Danilo de Paula 2019-05-28 23:42:20 UTC

Fix included in qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71

Comment 10 xianwang 2019-05-29 07:37:40 UTC

Bug verification:
Host:
P8:
4.18.0-94.el8.ppc64le
qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71.ppc64le
SLOF-20171214-5.gitfa98132.module+el8.1.0+2983+b2ae9c0a.noarch

P9:
4.18.0-80.1.2.el8_0.ppc64le
qemu-kvm-2.12.0-64.module+el8.0.0+3180+d6a3561d.2.ppc64le
SLOF-20171214-5.gitfa98132.module+el8+2616+396d822d.noarch

Steps are the same as in comment 1.
Result:
Migration completed and the VM works well after migration.

Comment 12 errata-xmlrpc 2019-11-06 07:14:47 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723