Description of problem:
Migrating a VM with a virtio-balloon device from ALT-7.6 to RHEL 8.1.0 fails. After the migration completes, QEMU crashes on the destination end with this error message:

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0

Version-Release number of selected component (if applicable):
src host:
4.14.0-115.11.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le
SLOF-20171214-2.gitfa98132.el7.noarch

dst host:
4.18.0-129.el8.ppc64le
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Guest:
4.14.0-115.11.1.el7a.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest on the src end with a virtio-balloon device:
   /usr/libexec/qemu-kvm -nodefaults -machine pseries-rhel7.6.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xc -monitor stdio
2. Boot a guest on the dst end in incoming mode:
   /usr/libexec/qemu-kvm -nodefaults -machine pseries-rhel7.6.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xc -monitor stdio -incoming tcp:0:5801
3. Start the migration on the src end:
   (qemu) migrate -d tcp:10.16.212.192:5801

Actual results:
The migration status is "completed" on the src end, while QEMU crashes on the destination end.

src:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off late-block-activate: off
Migration status: completed
total timds downtime: 27 milliseconds
setup: 4 milliseconds
transferred ram: 4548 kbytes
throughput: 388.28 mbps
remaining ram: 0 kbytes
total ram: 524288 kbytes
duplicate: 131762 pages
skipped: 0 pages
normal: 846 pages
normal bytes: 3384 kbytes
dirty sync count: 3
page size: 4 kbytes

dst:
(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load virtio-balloon:virtio
qemu-kvm: error while loading state for instance 0x0 of device 'pci@800000020000000:0c.0/virtio-balloon'
qemu-kvm: load of migration failed: Invalid argument

Expected results:
The migration completes and the VM works well on the destination end.

Additional info:
I. ALT-7.6 is supported only on powerpc, so this is a ppc-only issue.
II. This issue is hit only with qemu 4.0 on the destination. It works well with qemu 2.12, i.e. when the destination build is "qemu-kvm-2.12.0-83.module+el8.1.0+3852+0ba8aef0.ppc64le".
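The fields in the "Bad config data" message can be decoded by hand. The sketch below assumes that QEMU rejects an incoming PCI config byte when it differs from the destination device's byte in a bit that is checked (cmask) and not guest-writable (wmask, w1cmask), and that the printed values are hexadecimal. The function name `mismatching_bits` is hypothetical and only illustrates that comparison.

```python
def mismatching_bits(read, device, cmask, wmask, w1cmask):
    """Bits that differ between the migrated byte and the destination device,
    limited to bits that are checked and that the guest cannot write.
    Assumption: this mirrors the comparison behind the error message."""
    return (read ^ device) & cmask & ~wmask & ~w1cmask & 0xFF


# Values copied from the error message in this report (offset i=0x10).
bad_bits = mismatching_bits(read=0x21, device=0x01, cmask=0xFF, wmask=0xC0, w1cmask=0x00)
print(f"mismatching bits at offset 0x10: {bad_bits:#04x}")
```

Offset 0x10 is the first base address register (BAR0) in the standard PCI config header, and bit 0 set marks an I/O space BAR. Under the assumption above, the only offending bit is 0x20: the source sent an address with bit 5 set, while the destination treats bit 5 as read-only (wmask c0), which would be consistent with the destination BAR being larger than the source BAR.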
This issue does not exist on qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed.ppc64le, so it is a qemu 4.0 regression. The detailed build information is as follows:

src host:
4.14.0-115.11.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le
SLOF-20171214-2.gitfa98132.el7.noarch

dst host:
4.18.0-129.el8.ppc64le
qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch
It seems to be fixed in qemu-kvm-4.1.0. Could you retest?

I'm going to bisect to get a better idea of what happened.
This is fixed by:

commit 2bbadb08ce272d65e1f78621002008b07d1e0f03
Author: Stefan Hajnoczi <stefanha>
Date:   Wed Jul 10 16:14:40 2019 +0200

    virtio-balloon: fix QEMU 4.0 config size migration incompatibility

    The virtio-balloon config size changed in QEMU 4.0 even for existing
    machine types.  Migration from QEMU 3.1 to 4.0 can fail in some
    circumstances with the following error:

      qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0

    This happens because the virtio-balloon config size affects the VIRTIO
    Legacy I/O Memory PCI BAR size.

    Introduce a qdev property called "qemu-4-0-config-size" and enable it
    only for the QEMU 4.0 machine types.  This way <4.0 machine types use
    the old size, 4.0 uses the larger size, and >4.0 machine types use the
    appropriate size depending on enabled virtio-balloon features.

    Live migration to and from old QEMUs to QEMU 4.1 works again as long as
    a versioned machine type is specified (do not use just "pc"!).

    Originally-by: Wolfgang Bumiller <w.bumiller>
    Signed-off-by: Stefan Hajnoczi <stefanha>
    Message-Id: <20190710141440.27635-1-stefanha>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert>
    Tested-by: Dr. David Alan Gilbert <dgilbert>
    Tested-by: Wolfgang Bumiller <w.bumiller>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
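The size selection described in the commit message can be modelled roughly as follows. This is a sketch under stated assumptions, not the QEMU implementation: the byte counts (8, 12 and 16 for the balloon config, 20 for the legacy virtio-pci header without MSI-X), the rounding of the BAR to a power of two, and all function and parameter names are assumptions chosen for illustration. Only the property name "qemu-4-0-config-size" and the three machine-type cases come from the commit message.

```python
LEGACY_HEADER_BYTES = 20  # assumption: legacy virtio-pci header, MSI-X not enabled


def pow2ceil(n):
    """Round n up to the next power of two (n >= 1)."""
    return 1 << (n - 1).bit_length()


def balloon_config_size(qemu_4_0_config_size, page_poison=False, free_page_hint=False):
    """Hypothetical model of the three cases in the commit message.
    The sizes 8/12/16 are assumptions, not taken from this report."""
    if qemu_4_0_config_size or page_poison:
        return 16  # 4.0 machine types: always the larger size
    if free_page_hint:
        return 12
    return 8       # <4.0 machine types with no extra features: the old size


def legacy_io_bar_size(config_size):
    """Assumed size of the legacy I/O BAR: header plus device config, rounded up."""
    return pow2ceil(LEGACY_HEADER_BYTES + config_size)


old_bar = legacy_io_bar_size(balloon_config_size(qemu_4_0_config_size=False))
new_bar = legacy_io_bar_size(balloon_config_size(qemu_4_0_config_size=True))
print(f"old machine type BAR: {old_bar} bytes, 4.0 machine type BAR: {new_bar} bytes")
```

With these assumed numbers the BAR grows from 32 to 64 bytes, which would make bit 5 of the BAR address read-only on the destination and match the wmask c0 seen in the error. Keeping the flag off for pre-4.0 machine types such as pseries-rhel7.6.0 restores the smaller BAR, so the source and destination layouts agree again.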
(In reply to Laurent Vivier from comment #3)
> It seems fixed with qemu-kvm-4.1.0, could you retest ?
>
> I'm going to bisect to have a better idea of what happened.

I have tried this scenario on qemu 4.1, and it works well there.

source:
4.14.0-115.12.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le

destination:
4.18.0-134.el8.ppc64le
qemu-kvm-4.1.0-4.module+el8.1.0+4020+16089f93.ppc64le
I'm moving the bug to VERIFIED which indicates it's passed QE verification based on comment 6.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3723