Bug 1740972 - After migration completed, qemu crash on destination end with "qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Laurent Vivier
QA Contact: Gu Nini
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-14 03:16 UTC by xianwang
Modified: 2019-11-06 07:18 UTC (History)

Fixed In Version: qemu-kvm-4.1.0-4.module+el8.1.0+4020+16089f93.ppc64le
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:18:29 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3723 0 None None None 2019-11-06 07:18:50 UTC

Description xianwang 2019-08-14 03:16:38 UTC
Description of problem:
Migrating a VM with a virtio-balloon device from ALT-7.6 to RHEL 8.1.0 completes, but QEMU then crashes on the destination with this error message:
(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0


Version-Release number of selected component (if applicable):
src host:
4.14.0-115.11.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le
SLOF-20171214-2.gitfa98132.el7.noarch

dst host:
4.18.0-129.el8.ppc64le
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Guest:
4.14.0-115.11.1.el7a.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest on the source host with a virtio-balloon device:
/usr/libexec/qemu-kvm -nodefaults -machine pseries-rhel7.6.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xc -monitor stdio

2. Boot a guest on the destination host in incoming mode:
/usr/libexec/qemu-kvm -nodefaults -machine pseries-rhel7.6.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xc -monitor stdio -incoming tcp:0:5801

3. Start the migration on the source:
(qemu) migrate -d tcp:10.16.212.192:5801

Actual results:
The migration status is "completed" on the source, while QEMU crashes on the destination:
src:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off late-block-activate: off 
Migration status: completed
total timds
downtime: 27 milliseconds
setup: 4 milliseconds
transferred ram: 4548 kbytes
throughput: 388.28 mbps
remaining ram: 0 kbytes
total ram: 524288 kbytes
duplicate: 131762 pages
skipped: 0 pages
normal: 846 pages
normal bytes: 3384 kbytes
dirty sync count: 3
page size: 4 kbytes

dst:
(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load virtio-balloon:virtio
qemu-kvm: error while loading state for instance 0x0 of device 'pci@800000020000000:0c.0/virtio-balloon'
qemu-kvm: load of migration failed: Invalid argument
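
One way to read the first error line: offset 0x10 is the first byte of BAR0 in PCI config space, "read" is the byte that arrived in the migration stream, and "device" is the byte the destination device already holds. The sketch below assumes the load check rejects any difference in bits that are covered by cmask but are not writable (not in wmask or w1cmask). That rule is an assumption about QEMU's behaviour, and the helper name is hypothetical.

```python
import re

# Fields printed in the "Bad config data" message; all values are hex.
PATTERN = re.compile(
    r"i=0x(?P<i>[0-9a-f]+) read: (?P<read>[0-9a-f]+) "
    r"device: (?P<device>[0-9a-f]+) cmask: (?P<cmask>[0-9a-f]+) "
    r"wmask: (?P<wmask>[0-9a-f]+) w1cmask:\s*(?P<w1cmask>[0-9a-f]+)"
)

def decode_bad_config(line):
    """Parse the message and compute the bits that differ but are not writable."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a 'Bad config data' line")
    f = {k: int(v, 16) for k, v in m.groupdict().items()}
    # Assumed rule: a mismatch is fatal only in checked, non-writable bits.
    f["mismatch"] = ((f["read"] ^ f["device"]) & f["cmask"]
                     & ~f["wmask"] & ~f["w1cmask"] & 0xFF)
    return f

LINE = ("qemu-kvm: get_pci_config_device: Bad config data: i=0x10 "
        "read: 21 device: 1 cmask: ff wmask: c0 w1cmask:0")
fields = decode_bad_config(LINE)
print(hex(fields["i"]), hex(fields["mismatch"]))  # 0x10 0x20
```

Under that assumption the offending bit is 0x20: the source programmed BAR0 with address bit 5 set, which the destination treats as read-only because its wmask of c0 only allows bits 6 and 7 in that byte.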


Expected results:
The migration completes and the VM works normally on the destination.

Additional info:

Comment 1 xianwang 2019-08-14 03:33:34 UTC
I. 
ALT-7.6 is supported only on PowerPC, so this is a ppc-only issue.

II.
This issue occurs only when the destination runs qemu 4.0. It works well with qemu 2.12, i.e. when the destination build is "qemu-kvm-2.12.0-83.module+el8.1.0+3852+0ba8aef0.ppc64le".

Comment 2 xianwang 2019-08-15 05:17:36 UTC
This issue does not exist on qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed.ppc64le, so it is a qemu 4.0 regression. The build details are as follows:

src host:
4.14.0-115.11.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le
SLOF-20171214-2.gitfa98132.el7.noarch

dst host:
4.18.0-129.el8.ppc64le
qemu-kvm-3.1.0-30.module+el8.0.1+3755+6782b0ed.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Comment 3 Laurent Vivier 2019-08-20 10:39:14 UTC
It seems fixed with qemu-kvm-4.1.0, could you retest ?

I'm going to bisect to have a better idea of what happened.

Comment 4 Laurent Vivier 2019-08-20 12:43:56 UTC
This is fixed by:

commit 2bbadb08ce272d65e1f78621002008b07d1e0f03
Author: Stefan Hajnoczi <stefanha>
Date:   Wed Jul 10 16:14:40 2019 +0200

    virtio-balloon: fix QEMU 4.0 config size migration incompatibility
    
    The virtio-balloon config size changed in QEMU 4.0 even for existing
    machine types.  Migration from QEMU 3.1 to 4.0 can fail in some
    circumstances with the following error:
    
      qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0
    
    This happens because the virtio-balloon config size affects the VIRTIO
    Legacy I/O Memory PCI BAR size.
    
    Introduce a qdev property called "qemu-4-0-config-size" and enable it
    only for the QEMU 4.0 machine types.  This way <4.0 machine types use
    the old size, 4.0 uses the larger size, and >4.0 machine types use the
    appropriate size depending on enabled virtio-balloon features.
    
    Live migration to and from old QEMUs to QEMU 4.1 works again as long as
    a versioned machine type is specified (do not use just "pc"!).
    
    Originally-by: Wolfgang Bumiller <w.bumiller>
    Signed-off-by: Stefan Hajnoczi <stefanha>
    Message-Id: <20190710141440.27635-1-stefanha>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert>
    Tested-by: Dr. David Alan Gilbert <dgilbert>
    Tested-by: Wolfgang Bumiller <w.bumiller>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
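
The size dependency described in the commit message can be sketched roughly as follows. The numbers are assumptions that do not appear in this bug: a legacy virtio-pci header of 20 bytes (24 with MSI-X), a balloon config of 8 bytes before QEMU 4.0 and 16 bytes in 4.0, and a BAR size rounded up to the next power of two. The function names are hypothetical.

```python
def pow2ceil(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

# All four sizes below are assumptions, not values taken from the bug report.
LEGACY_HEADER = 20            # legacy virtio-pci header, no MSI-X
LEGACY_HEADER_MSIX = 24       # same header with MSI-X fields
OLD_BALLOON_CONFIG = 8        # num_pages + actual
QEMU_4_0_BALLOON_CONFIG = 16  # plus two more 32-bit fields

def legacy_bar_size(config_len, msix=False):
    """Size of the legacy I/O BAR for a given device config length."""
    header = LEGACY_HEADER_MSIX if msix else LEGACY_HEADER
    return pow2ceil(header + config_len)

def bar_low_byte_wmask(bar_size):
    """Writable address bits in the lowest byte of a BAR of this size."""
    return ~(bar_size - 1) & 0xFF

for name, cfg in (("old", OLD_BALLOON_CONFIG), ("4.0", QEMU_4_0_BALLOON_CONFIG)):
    size = legacy_bar_size(cfg)
    print(name, size, hex(bar_low_byte_wmask(size)))
# old 32 0xe0
# 4.0 64 0xc0
```

With these assumed sizes, the older layout yields a 32-byte BAR and the 4.0 layout a 64-byte BAR, which would be consistent with the "wmask: c0" reported on the destination and with the source having placed the BAR at a 32-byte-aligned address.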

Comment 6 xianwang 2019-08-21 02:32:48 UTC
(In reply to Laurent Vivier from comment #3)
> It seems fixed with qemu-kvm-4.1.0, could you retest ?
> 
> I'm going to bisect to have a better idea of what happened.

I have tried this scenario with qemu 4.1 on the destination, and it works well:
source:
4.14.0-115.12.1.el7a.ppc64le
qemu-kvm-rhev-2.12.0-18.el7_6.7.ppc64le

destination:
4.18.0-134.el8.ppc64le
qemu-kvm-4.1.0-4.module+el8.1.0+4020+16089f93.ppc64le

Comment 8 Qunfang Zhang 2019-09-19 05:36:52 UTC
I'm moving the bug to VERIFIED, since it passed QE verification based on comment 6.

Comment 11 errata-xmlrpc 2019-11-06 07:18:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723

