Bug 1730566 - 7 -> 8 migration failed: Missing section footer for 0000:00:1f.3/ich9_smb/Missing section footer for 0000:00:01.3/piix4_pm
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: jingzhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-17 07:13 UTC by jingzhao
Modified: 2019-11-06 07:18 UTC (History)

Fixed In Version: qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:17:49 UTC
Type: Bug
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3723 None None None 2019-11-06 07:18:16 UTC

Description jingzhao 2019-07-17 07:13:26 UTC
Description of problem:
Migration fails when migrating a guest from a RHEL 7.7 host to a RHEL 8.1 host.

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x60 read: 11 device: 4 cmask: ff wmask: 0 w1cmask:0
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:04.0/pcie-root-port'
qemu-kvm: load of migration failed: Invalid argument



Version-Release number of selected component (if applicable):
source host version:
# rpm -qa |grep qemu-kvm-rhev
qemu-kvm-rhev-2.12.0-33.el7.x86_64
# uname -r
3.10.0-1061.el7.x86_64

destination host version:
qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf.x86_64
# uname -r
4.18.0-116.el8.x86_64

guest: rhel801-64-virtio-scsi.qcow2

How reproducible:
3/3

Steps to Reproduce:
1. Boot the guest with the qemu command line in [1].
2. Migrate the guest from the RHEL 7.7 host to the RHEL 8.1 host.
3. Check the status in the source qemu monitor:
(qemu) info status
VM status: paused (postmigrate)



Actual results:
Migration failed with:
(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x60 read: 11 device: 4 cmask: ff wmask: 0 w1cmask:0
qemu-kvm: Failed to load PCIDevice:config
qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:04.0/pcie-root-port'
qemu-kvm: load of migration failed: Invalid argument
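
For triage, the "Bad config data" line can be decoded by hand. The sketch below assumes the destination rejects a config byte when (read ^ device) & cmask & ~wmask & ~w1cmask is non-zero, which is how the check in get_pci_config_device (QEMU hw/pci/pci.c) appears to work. The function name decode_bad_config and the regular expression are illustrative and not part of QEMU.

```python
import re

# Pattern for the destination-side message quoted above; all numbers are hex.
BAD_CONFIG = re.compile(
    r"i=0x(?P<offset>[0-9a-f]+) read: (?P<read>[0-9a-f]+) "
    r"device: (?P<device>[0-9a-f]+) cmask: (?P<cmask>[0-9a-f]+) "
    r"wmask: (?P<wmask>[0-9a-f]+) w1cmask:\s*(?P<w1cmask>[0-9a-f]+)"
)


def decode_bad_config(line):
    """Return (offset, mismatching_bits) for a 'Bad config data' line, or None."""
    m = BAD_CONFIG.search(line)
    if m is None:
        return None
    f = {key: int(value, 16) for key, value in m.groupdict().items()}
    # Assumed check: bits that differ between the incoming stream ("read") and
    # the destination device model ("device"), that are compared (cmask), and
    # that are neither guest-writable (wmask) nor write-1-to-clear (w1cmask).
    bad = (f["read"] ^ f["device"]) & f["cmask"] & ~f["wmask"] & ~f["w1cmask"] & 0xFF
    return f["offset"], bad


line = ("qemu-kvm: get_pci_config_device: Bad config data: i=0x60 read: 11 "
        "device: 4 cmask: ff wmask: 0 w1cmask:0")
offset, bad = decode_bad_config(line)
print(f"offset 0x{offset:02x}: mismatching bits 0x{bad:02x}")
```

For the line above this reports offset 0x60 with mismatching bits 0x15, meaning the source and destination device models disagree on a read-only byte of the root port's config space.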


Expected results:
Migration completes successfully.


Additional info:
[1]
/usr/libexec/qemu-kvm  \
-rtc base=utc,clock=host \
-qmp tcp:0:3333,server,nowait \
-enable-kvm  \
-watchdog i6300esb \
-serial tcp:0:4444,server,nowait \
-monitor stdio \
-boot order=cdn,once=c,menu=on,strict=on \
-vga qxl \
-smp 8,cores=1,threads=1,sockets=8 \
-machine pc-q35-rhel7.6.0 \
-vnc :10 \
-watchdog-action reset \
-object memory-backend-ram,policy=bind,id=mem-1,size=2048M,prealloc=yes,host-nodes=0 -numa node,memdev=mem-1 \
-object memory-backend-ram,policy=bind,id=mem-2,size=2048M,prealloc=yes,host-nodes=0 -numa node,memdev=mem-2 \
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
-device ahci,id=ahci0,bus=pcie.0,addr=0x3 \
-device ide-cd,bus=ahci0.0,unit=0,drive=drive-ide0-1-1,id=ide0-1-1 \
-device pcie-root-port,id=pcie-root-port0,bus=pcie.0,addr=0x4 \
-device virtio-scsi-pci,id=scsi0,bus=pcie-root-port0 \
-device pcie-root-port,id=pcie-root-port1,bus=pcie.0,addr=0x5,chassis=1 \
-device scsi-hd,bus=scsi0.0,lun=0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,lun=1,id=data-disk1 \
-device pcie-root-port,id=pcie-root-port15,bus=pcie.0,addr=0x6,chassis=2 \
-device virtio-serial-pci,id=virtio-serial0,bus=pcie-root-port15 \
-device isa-serial,chardev=charserial0,id=serial0 \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
-device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 \
-device intel-hda,id=sound0,bus=pcie.0,addr=0x7 \
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
-device intel-hda,id=sound1,bus=pcie.0,addr=0x8 \
-device hda-micro,id=sound1-codec0,bus=sound1.0 \
-device intel-hda,id=sound2,bus=pcie.0,addr=0x9 \
-device hda-output,id=sound2-codec0,bus=sound2.0,cad=0 \
-device ich9-intel-hda,id=sound3,bus=pcie.0,addr=0xa \
-device hda-duplex,id=sound3-codec0,bus=sound3.0,cad=0 \
-device pvpanic,ioport=1285 \
-device pcie-root-port,id=pcie-root-port2,bus=pcie.0,addr=0xb,chassis=3 \
-device e1000e,netdev=hostnet1,id=virtio-net-pci1,mac=00:52:68:26:31:03,bus=pcie-root-port2 \
-device pcie-root-port,id=pcie-root-port3,bus=pcie.0,addr=0xc,chassis=4 \
-device virtio-net-pci,netdev=hostnet2,id=virtio-net-pci2,mac=00:52:68:26:31:04,bus=pcie-root-port3 \
-device ide-hd,drive=drive-data-disk,id=system-disk,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,ver=fuxc-ver,bus=ide.0,unit=0 \
-device pcie-root-port,id=pcie-root-port4,bus=pcie.0,addr=0xd,chassis=5 \
-device ich9-usb-uhci6,id=uhci6,bus=pcie-root-port4 \
-device usb-kbd,id=kdb0,bus=uhci6.0 \
-device pcie-root-port,id=pcie-root-port5,bus=pcie.0,addr=0xe,chassis=6 \
-device ich9-usb-uhci5,id=uhci5,bus=pcie-root-port5 \
-device usb-mouse,id=mouse0,bus=uhci5.0 \
-device pcie-root-port,id=pcie-root-port6,bus=pcie.0,addr=0xf,chassis=7 \
-device nec-usb-xhci,id=xhci,bus=pcie-root-port5 \
-device usb-bot,id=bot1,bus=xhci.0 \
-device scsi-hd,bus=bot1.0,scsi-id=0,lun=0,drive=usb-disk1,id=usb-disk1 \
-device pcie-root-port,id=pcie-root-port7,bus=pcie.0,addr=0x10,chassis=8 \
-device usb-ehci,id=ehci,bus=pcie-root-port7 \
-device usb-bot,id=bot2,bus=ehci.0 \
-device scsi-hd,bus=bot2.0,scsi-id=0,lun=0,drive=usb-disk2,id=usb-disk2 \
-device pcie-root-port,id=pcie-root-port8,bus=pcie.0,addr=0x11,chassis=9 \
-device piix3-usb-uhci,id=usb,bus=pcie-root-port8 \
-device usb-bot,id=bot3,bus=usb.0 \
-device scsi-hd,bus=bot3.0,scsi-id=0,lun=0,drive=usb-disk3,id=usb-disk3 \
-device pcie-root-port,id=pcie-root-port9,bus=pcie.0,addr=0x12,chassis=10 \
-device ich9-usb-uhci3,id=uhci,bus=pcie-root-port9 \
-device usb-storage,drive=drive-usb-0,id=usb-0,removable=on,bus=uhci.0,port=1 \
-device pcie-root-port,id=pcie-root-port10,bus=pcie.0,addr=0x13.0,multifunction=on,chassis=11 \
-device pcie-root-port,id=pcie-root-port11,bus=pcie.0,addr=0x13.1,chassis=12 \
-device ich9-usb-ehci1,id=ehci1,bus=pcie-root-port10 \
-device usb-storage,drive=drive-usb-1,id=usb-1,removable=on,bus=ehci1.0,port=1 \
-device nec-usb-xhci,id=xhci1,bus=pcie-root-port11 \
-device usb-storage,drive=drive-usb-2,id=usb-2,removable=on,bus=xhci1.0,port=1 \
-device pcie-root-port,id=pcie-root-port12,bus=pcie.0,addr=0x14,chassis=13 \
-device virtio-rng-pci,id=rng0,bus=pcie-root-port12 \
-device pcie-root-port,id=pcie-root-port13,bus=pcie.0,addr=0x15,chassis=14 \
-device virtio-balloon-pci,id=balloon0,bus=pcie-root-port13 \
-device isa-debugcon,chardev=seabioslog_id,iobase=0x402 \
-nodefaults  \
-name "mouse-vm" \
-netdev tap,id=hostnet1,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
-netdev tap,id=hostnet2,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
-m 4096,slots=256,maxmem=32G \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/a1.iso,node-name=drive_sys1 \
-blockdev driver=raw,node-name=drive-ide0-1-0,file=drive_sys1 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/a2.iso,node-name=drive_sys2 \
-blockdev driver=raw,node-name=drive-ide0-1-1,file=drive_sys2 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/rhel801-64-virtio-scsi.qcow2,node-name=drive_sys3 \
-blockdev driver=qcow2,node-name=drive-virtio-disk0,file=drive_sys3 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d2.qcow2,node-name=drive_sys4 \
-blockdev driver=qcow2,node-name=drive-scsi-disk,file=drive_sys4 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d3.raw,node-name=drive_sys5 \
-blockdev driver=raw,node-name=drive-data-disk,file=drive_sys5 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d4.qcow2,node-name=drive_sys6 \
-blockdev driver=qcow2,node-name=usb-disk1,file=drive_sys6 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d5.qcow2,node-name=drive_sys7 \
-blockdev driver=qcow2,node-name=usb-disk2,file=drive_sys7 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d6.qcow2,node-name=drive_sys8 \
-blockdev driver=qcow2,node-name=usb-disk3,file=drive_sys8 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d7.qcow2,node-name=drive_sys9 \
-blockdev driver=qcow2,node-name=drive-usb-0,file=drive_sys9 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d8.qcow2,node-name=drive_sys10 \
-blockdev driver=qcow2,node-name=drive-usb-1,file=drive_sys10 \
-blockdev driver=file,cache.direct=on,cache.no-flush=off,filename=/home/migration/d9.qcow2,node-name=drive_sys11 \
-blockdev driver=qcow2,node-name=drive-usb-2,file=drive_sys11 \
-sandbox off \
-chardev socket,id=charmonitor,path=/home/tmp1,server,nowait \
-chardev pty,id=charserial0 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-chardev socket,id=charchannel1,path=/home/tmp2,server,nowait \
-chardev socket,id=seabioslog_id,path=/home/seabios,server,nowait \

Comment 1 jingzhao 2019-07-17 07:14:46 UTC
Marking this as a regression, since it does not reproduce with qemu-kvm-3.1.0-28.module+el8.0.1+3556+b59953c6.x86_64 on the destination host.

Comment 3 jingzhao 2019-07-18 02:27:48 UTC
1. A minimal command line always hits the issue.

On the source host:
/usr/libexec/qemu-kvm -M pc-q35-rhel7.6.0 -monitor stdio

On the destination host:
/usr/libexec/qemu-kvm -M pc-q35-rhel7.6.0 -monitor stdio -incoming tcp:0:4000

Migration fails with:
qemu-kvm: Missing section footer for 0000:00:1f.3/ich9_smb
qemu-kvm: warning: TSC frequency mismatch between VM (2397223 kHz) and host (3504000 kHz), and TSC scaling unavailable
qemu-kvm: load of migration failed: Invalid argument


2. The pc (i440fx) machine type always hits the issue as well.

On the source host:
/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.6.0 -monitor stdio

On the destination host:
/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.6.0 -monitor stdio -incoming tcp:0:4000


qemu-kvm: Missing section footer for 0000:00:01.3/piix4_pm
qemu-kvm: warning: TSC frequency mismatch between VM (2397223 kHz) and host (3504000 kHz), and TSC scaling unavailable
qemu-kvm: load of migration failed: Invalid argument
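
The commands above do not show the migrate step itself. A minimal sketch of driving it over QMP follows, assuming the source QEMU also exposes a QMP socket such as the '-qmp tcp:0:3333,server,nowait' option from the full command line in the description (the simple commands here only use '-monitor stdio'). The helper names and DEST_HOST are hypothetical.

```python
import json
import socket


def qmp_command(name, **arguments):
    """Encode one QMP command as a newline-terminated JSON message."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\n").encode()


def read_reply(stream):
    """Skip asynchronous events and return the next command reply."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def start_migration(qmp_host, qmp_port, dest_uri):
    """Negotiate QMP capabilities, start a migration, return its first status."""
    with socket.create_connection((qmp_host, qmp_port)) as sock:
        stream = sock.makefile("rwb")
        stream.readline()  # greeting banner QEMU sends on connect
        for cmd in (qmp_command("qmp_capabilities"),
                    qmp_command("migrate", uri=dest_uri),
                    qmp_command("query-migrate")):
            stream.write(cmd)
            stream.flush()
            reply = read_reply(stream)
            if "error" in reply:
                raise RuntimeError(reply["error"])
        return reply["return"]
```

For example, start_migration("localhost", 3333, "tcp:DEST_HOST:4000") against a destination started with '-incoming tcp:0:4000'. From the HMP monitor the equivalent is 'migrate -d tcp:DEST_HOST:4000'. The load failure itself is printed by the destination QEMU, so its output still has to be checked separately.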

Comment 5 Dr. David Alan Gilbert 2019-07-18 14:49:53 UTC
OK, yes that should be a blocker.

We'll need to test with the fixes from https://bugzilla.redhat.com/show_bug.cgi?id=1719649

Please:
  a) Check whether it still happens on our 4.1.0-rc1 rebase once it has the 1719649 fixes
  b) Show the output of 'sudo lspci -vv' from a guest booted with each of the two qemus

Comment 9 Dr. David Alan Gilbert 2019-07-26 13:54:37 UTC
With my current world I'm seeing that rhel8 has gained an 'Access Control Services' capability:
Capabilities: [148 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans+
		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-

Comment 10 Dr. David Alan Gilbert 2019-07-26 14:17:22 UTC
I think the pcie problem is the addition of the ACS capability that happened between 3.1.0 and 4.0 and hasn't been switched off for old machine types.
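
A difference like this can be spotted mechanically by comparing the 'lspci -vv' output from a guest booted under each qemu. The following is a rough sketch, assuming both outputs have been saved as text. The helper names are hypothetical, and capabilities are matched only by the name that follows the [offset] field, which may not be robust for every lspci version.

```python
import re

# Device header, e.g. "00:04.0 PCI bridge: ..." (optionally with a domain prefix).
DEVICE = re.compile(r"^([0-9a-f:]+\.[0-7]) ")
# Capability line, e.g. "\tCapabilities: [148 v1] Access Control Services".
# Only the name up to the first colon is kept, to ignore per-boot state.
CAP = re.compile(r"^\s+Capabilities: \[[0-9a-f]+(?: v\d+)?\] ([^:]+)")


def capabilities(lspci_text):
    """Map each device address to the set of capability names it advertises."""
    caps, current = {}, None
    for line in lspci_text.splitlines():
        m = DEVICE.match(line)
        if m:
            current = m.group(1)
            caps[current] = set()
            continue
        m = CAP.match(line)
        if m and current is not None:
            caps[current].add(m.group(1).strip())
    return caps


def added_capabilities(old_text, new_text):
    """Return {device: capability names present in new_text but not old_text}."""
    old, new = capabilities(old_text), capabilities(new_text)
    diff = {}
    for dev, names in new.items():
        extra = names - old.get(dev, set())
        if extra:
            diff[dev] = extra
    return diff


# Made-up sample input; the device description is illustrative only.
old = ("00:04.0 PCI bridge: example root port\n"
       "\tCapabilities: [54] Express (v2) Root Port (Slot+), MSI 00\n")
new = old + "\tCapabilities: [148 v1] Access Control Services\n"
print(added_capabilities(old, new))
```

Any capability that shows up only under the newer qemu for an old machine type is a candidate for this kind of migration failure.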

Comment 11 Dr. David Alan Gilbert 2019-07-30 09:54:10 UTC
I've sent two separate fixes upstream for 4.1rc3 that should fix this, in combination with the machine type fixes:

  a) Revert "hw: report invalid disable-legacy|modern usage for virtio-1-only devs"
  b) pcie_root_port: Disable ACS on older machines

Comment 12 Dr. David Alan Gilbert 2019-07-31 15:28:37 UTC
Both are now merged into upstream 4.1rc3:
c8557f1b4873549fc231 pcie_root_port: Disable ACS on older machines
a58dfba20168dae18650  pcie_root_port: Allow ACS to be disabled

92fd453c6717acbeafcb Revert "Revert "globals: Allow global properties to be optional""
dd56040d297a0c530e20 Revert "hw: report invalid disable-legacy|modern usage for virtio-1-only devs"

You'll also need the latest bz 1719649 version.

Comment 16 errata-xmlrpc 2019-11-06 07:17:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723

