Bug 1511312 - Migrate an VM with pci-bridge or pcie-root-port failed
Summary: Migrate an VM with pci-bridge or pcie-root-port failed
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: jingzhao
URL:
Whiteboard:
Keywords: Regression
Depends On:
Blocks:
Reported: 2017-11-09 06:43 UTC by Meina Li
Modified: 2018-04-11 00:46 UTC (History)
17 users (show)

Clone Of:
Last Closed: 2018-04-11 00:46:47 UTC




External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2018:1104
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2018-04-10 22:54:38 UTC

Description Meina Li 2017-11-09 06:43:55 UTC
Description of problem:
Migrating a VM with pci-bridge or pcie-root-port fails.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.10.0-4.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
Scenario 1: Migrate VM from RHEL7.5 to RHEL7.4
1. Prepare the migrate environment.

2. Start a VM with the following xml:
......
    <controller type='pci' index='0' model='pci-root'/>
......
# virsh start lmn
Domain lmn started

3. Migrate the VM to RHEL7.4:
# virsh migrate lmn qemu+ssh://10.66.7.27/system --unsafe --verbose --live
root@10.66.7.27's password: 
Migration: [100 %]

4. Re-edit the VM with the following xml and start it:
......
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='pci' index='1' model='pci-bridge'/>
......
# virsh start lmn
Domain lmn started

5. Migrate the VM to RHEL7.4:
# virsh migrate lmn qemu+ssh://10.66.7.27/system --unsafe --verbose --live
root@10.66.7.27's password: 
Migration: [100 %]error: internal error: qemu unexpectedly closed the monitor: 2017-11-08T06:33:53.174967Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/1 (label charserial0)
2017-11-08T06:34:47.874295Z qemu-kvm: Missing section footer for 0000:00:01.0/pcie-root-port
2017-11-08T06:34:47.874395Z qemu-kvm: warning: TSC frequency mismatch between VM (3392295 kHz) and host (3392293 kHz), and TSC scaling unavailable
2017-11-08T06:34:47.874590Z qemu-kvm: load of migration failed: Invalid argument

6. The same failure also occurs with the q35 machine type.

Actual results:
As described above.

Expected results:
Migration succeeds.

Additional info:
Not reproduced on qemu-kvm-rhev-2.10.0-2.el7.x86_64
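
The failure signature from step 5 can be picked out of a destination-side log with a small filter. This is only a sketch: the function name missing_footer_devices is made up for illustration, and the sample input is the two relevant log lines quoted in step 5.

```shell
#!/bin/sh
# Print the device named in every "Missing section footer" message.
# Reads the files given as arguments, or stdin if none are given.
# (Hypothetical helper, not part of qemu or libvirt.)
missing_footer_devices() {
    sed -n 's/.*Missing section footer for //p' "$@"
}

# Demo using the log lines quoted in step 5 of this report.
missing_footer_devices <<'EOF'
2017-11-08T06:34:47.874295Z qemu-kvm: Missing section footer for 0000:00:01.0/pcie-root-port
2017-11-08T06:34:47.874590Z qemu-kvm: load of migration failed: Invalid argument
EOF
```

To scan a real log, pass its path as an argument, for example missing_footer_devices /var/log/libvirt/qemu/lmn.log (the path is an assumption based on the usual libvirt log location and the domain name used above).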

Comment 3 Qunfang Zhang 2017-11-09 09:08:26 UTC
pci-bridge issue is tracked in the following BZ:

Bug 1508271 - Migration is failed from host RHEL7.4.z to host RHEL7.5 with "-machine pseries-rhel7.4.0 -device pci-bridge,id=pci_bridge,bus=pci.0,addr=03,chassis_nr=1"

Comment 5 Juan Quintela 2017-11-13 09:46:54 UTC
The result in comment 4 is really weird. Could you double-check that the machine type is the same on both sides? It looks like you are using the 7.5.0 one on one side and 7.4.0 on the other, no?

Comment 6 jingzhao 2017-11-14 02:40:21 UTC
(In reply to Juan Quintela from comment #5)
> The result in comment 4 is really weird. Could you double-check that the
> machine type is the same on both sides? It looks like you are using the
> 7.5.0 one on one side and 7.4.0 on the other, no?

Sorry, I changed to the same machine type on both sides and can reproduce the issue.

Hit the error
(qemu) red_dispatcher_loadvm_commands: 
qemu-kvm: Missing section footer for 0000:00:0a.0/pcie-root-port


/usr/libexec/qemu-kvm \
-M pc-q35-rhel7.4.0,accel=kvm,kernel-irqchip=split \
-device intel-iommu,intremap=on \
-cpu Haswell-noTSX \
-nodefaults -rtc base=utc \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-enable-kvm \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-k en-us \
-nodefaults \
-global isa-debugcon.iobase=0x402 \
-boot menu=on \
-qmp tcp:0:6667,server,nowait \
-usb \
-device usb-tablet \
-vga qxl \
-drive file=/mnt/test/rhel75-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 \
-device pcie-root-port,bus=pcie.0,id=root0,multifunction=on,chassis=1,addr=0xa.0 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e,bus=root0 -netdev tap,id=tap10 \
-device pcie-root-port,bus=pcie.0,id=root1,chassis=11,addr=0xa.1 \
-monitor stdio \
-vnc :1

Thanks
Jing

Comment 7 Dr. David Alan Gilbert 2017-11-14 17:42:14 UTC
Confirmed; I can trigger it with the trivial command line:

/usr/libexec/qemu-kvm -M pc-q35-rhel7.4.0,accel=kvm -nographic -device pcie-root-port,bus=pcie.0,id=root0 -S

Comment 8 Dr. David Alan Gilbert 2017-11-14 18:04:58 UTC
The x-migrate-msix property is false on 2.9 and true on 2.10; flipping it makes the migration work.
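
The comment does not say how or on which host the property was flipped. One standard way to set a device property default in QEMU is the -global driver.property=value option, so a sketch of the trivial command line from comment 7 with the property set explicitly might look like the following. The helper name print_repro_cmd is hypothetical, the choice of value is left to the caller, and whether -global with this driver name takes effect in a given build is an assumption this report does not confirm. The function only prints the command; it does not start qemu.

```shell
#!/bin/sh
# Print (do not run) the comment 7 reproducer with x-migrate-msix set explicitly.
# $1 is the property value, e.g. "true" or "false". Which value and which
# side of the migration needs it is an assumption, not stated in the report.
print_repro_cmd() {
    value=$1
    printf '%s\n' "/usr/libexec/qemu-kvm -M pc-q35-rhel7.4.0,accel=kvm -nographic -global pcie-root-port.x-migrate-msix=${value} -device pcie-root-port,bus=pcie.0,id=root0 -S"
}

print_repro_cmd false
```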

Comment 9 Dr. David Alan Gilbert 2017-11-14 18:40:06 UTC
This is odd because I can see:

    SET_MACHINE_COMPAT(m, PC_RHEL7_4_COMPAT);

#define PC_RHEL7_4_COMPAT \
        HW_COMPAT_RHEL7_4 \


#define HW_COMPAT_RHEL7_4 \
    { /* HW_COMPAT_RHEL7_4 */ \
        .driver   = "intel-iommu",\
        .property = "pt",\
        .value    = "off",\
    },{ /* HW_COMPAT_RHEL7_4 */ \
        .driver   = "pcie-root-port",\
        .property = "x-migrate-msix",\
        .value    = "false",\
    },

Comment 10 Dr. David Alan Gilbert 2017-11-14 20:13:43 UTC
ok, we just need to remove that last entry;  brewing.

Comment 11 Dr. David Alan Gilbert 2017-11-15 12:31:56 UTC
Downstream fixes posted:
  pcie_root_port: Fix x-migrate-msix compat
  q35: Fix mismerge

Comment 13 Miroslav Rezanina 2017-11-22 15:21:28 UTC
Fix included in qemu-kvm-rhev-2.10.0-7.el7
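
To check whether a host already carries the fixed build, the installed version-release can be compared against 2.10.0-7.el7. A minimal sketch, assuming GNU sort with -V is available; the helper name ver_ge is hypothetical, and sort -V ordering is only an approximation of rpm's own version comparison, although it agrees for the simple strings in this report.

```shell
#!/bin/sh
# Succeed if version-release $1 is greater than or equal to $2,
# judged by GNU "sort -V" ordering (an approximation of rpm's rules).
ver_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed=2.10.0-7.el7
# The first version is the build this bug was reported against.
for v in 2.10.0-4.el7 2.10.0-7.el7; do
    if ver_ge "$v" "$fixed"; then
        echo "$v: includes the fix"
    else
        echo "$v: predates the fix"
    fi
done
```

On a real host the first argument could come from rpm, for example ver_ge "$(rpm -q --qf '%{VERSION}-%{RELEASE}' qemu-kvm-rhev)" 2.10.0-7.el7. Note that "predates the fix" does not by itself mean a build is affected: the description reports that 2.10.0-2.el7 did not reproduce the problem.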

Comment 18 errata-xmlrpc 2018-04-11 00:46:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104

