Bug 1420216

Summary: Migration from RHEL7.3.z -> RHEL7.4 failed with e1000e nic card
Product: Red Hat Enterprise Linux 7
Reporter: huiqingding <huding>
Component: qemu-kvm-rhev
Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA
QA Contact: huiqingding <huding>
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: chayang, dgilbert, juzhang, knoel, michen, mrezanin, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.8.0-5.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1420195
Environment:
Last Closed: 2017-08-01 23:44:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description huiqingding 2017-02-08 08:09:07 UTC
Description of problem:
Migration failed from rhel7.3.z to rhel7.4 with e1000e nic card

Version-Release number of selected component (if applicable):
Source host(rhel7.3.z):
kernel-3.10.0-514.11.1.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.3.x86_64

Destination host(rhel7.4):
kernel-3.10.0-558.el7.x86_64
qemu-kvm-rhev-2.8.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a RHEL 7.3 guest on the source host:
# /usr/libexec/qemu-kvm \
-name rhel7 \
-S -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off \
-m 2048 \
-cpu Opteron_G4,check \
-realtime mlock=off \
-smp 4,maxcpus=4,sockets=4,cores=1,threads=1 \
-uuid 49a3438a-70a3-4ba8-92ce-3a05e0934608 \
-nodefaults \
-rtc base=utc,driftfix=slew \
-boot order=c,menu=on,strict=on \
-drive file=/mnt/rhel7.3.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=f65effa5-90a6-47f2-8487-a9f64c95d4f5,cache=none,discard=unmap,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet2,vhost=on,script=/etc/qemu-ifup \
-device e1000e,netdev=hostnet2,id=virtio-net-pci2,mac=4e:63:28:bc:c1:75,bus=pci.0,addr=0x5,multifunction=off \
-monitor stdio \
-qmp tcp:0:4466,server,nowait -serial unix:/tmp/ttym,server,nowait \
-spice port=5910,addr=0.0.0.0,disable-ticketing,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=134217728,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2

2. Boot the guest on the destination host with the same command line plus "-incoming tcp:0:5800"

3. Start the migration from the source monitor:
(qemu) migrate -d tcp:10.73.72.58:5800
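
Step 3 can also be driven over the QMP socket that the command line in step 1 opens on port 4466, instead of the HMP monitor. A minimal sketch, assuming the messages are sent as one JSON object per line after the QMP greeting; the helper name `qmp_migrate_messages` is hypothetical, and the sketch only builds the messages, it does not open the connection.

```python
import json


def qmp_migrate_messages(uri):
    """Build the QMP messages to start a migration to `uri` and poll its status.

    Hypothetical helper for illustration; it only constructs the JSON objects.
    """
    return [
        # QMP requires capability negotiation before any other command.
        {"execute": "qmp_capabilities"},
        # Start the outgoing migration to the destination URI.
        {"execute": "migrate", "arguments": {"uri": uri}},
        # Ask for the migration status afterwards.
        {"execute": "query-migrate"},
    ]


# Destination address taken from step 3 of this report.
for message in qmp_migrate_messages("tcp:10.73.72.58:5800"):
    print(json.dumps(message))
```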

Actual results:
After step 3, the migration fails and the destination qemu-kvm quits with:
(qemu) ERROR: Cannot migrate while device properties (subsys/subsys_ven) differ
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:05.0/e1000e'
qemu-kvm: load of migration failed: Operation not permitted

Expected results:
Migration completes normally.

Additional info:

Comment 1 huiqingding 2017-02-08 09:21:14 UTC
Ran the vmstate static check; the result is as follows:
# python vmstate-static-checker.py -s rhel7.3.json_7.3 -d rhel7.4.json_7.3 
Section "e1000e", Description "e1000e": expected field "intr_state", got "core.rxbuf_min_shift"; skipping rest
Section "rtl8139", Description "rtl8139": expected field "tally_counters", got "tally_counters.TxOk"; skipping rest
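
The checker output above names the first field at which the two vmstate descriptions stop matching. A minimal Python sketch of that kind of ordered comparison follows. It is an illustration only, not the logic of vmstate-static-checker.py; the function name is hypothetical, and the field lists are made-up placeholders apart from "intr_state" and "core.rxbuf_min_shift".

```python
def first_field_mismatch(src_fields, dst_fields):
    """Return (index, expected, got) for the first differing field, or None."""
    for index, (expected, got) in enumerate(zip(src_fields, dst_fields)):
        if expected != got:
            return index, expected, got
    # Same common prefix but different lengths: one side has trailing fields.
    if len(src_fields) != len(dst_fields):
        index = min(len(src_fields), len(dst_fields))
        expected = src_fields[index] if index < len(src_fields) else None
        got = dst_fields[index] if index < len(dst_fields) else None
        return index, expected, got
    return None


# Hypothetical, shortened field lists; only the two names from the
# checker output are real.
SRC_73 = ["field_a", "field_b", "intr_state", "core.rxbuf_min_shift"]
DST_74 = ["field_a", "field_b", "core.rxbuf_min_shift"]

mismatch = first_field_mismatch(SRC_73, DST_74)
if mismatch is not None:
    _, expected, got = mismatch
    print(f'expected field "{expected}", got "{got}"; skipping rest')
```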

Comment 3 Dr. David Alan Gilbert 2017-02-09 09:44:00 UTC
OK, can reproduce this, easiest test is:

/usr/libexec/qemu-kvm -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -device e1000e,multifunction=off -nographic

Comment 4 Dr. David Alan Gilbert 2017-02-09 10:48:14 UTC
Yes, the 'intr_state' field spotted by the checker is the problem.

7.3 (based on 2.6) backported e1000e by pulling in commits:
   6f3fbe4ed06a54b14716e619f1053c073fddae49
   103916cbe923f08080717942f1b5f9c4eb74aa11

that went in upstream after 2.6.
However before 2.7.0's release another two commits were added to e1000e:
   e0af5a0e8b74c674d29be3224b7ec16ba278e99c
   66bf7d58d830e6370895e4f1bb1257d135661872

which removed the need for that field. Because these went in before 2.7.0 was released, and e1000e was entirely new in 2.7.0, there is no upstream compatibility problem.
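
To see why one extra field on the sending side is enough to break loading, here is a small Python sketch. It assumes a simplified model in which device state is a flat sequence of big-endian 32-bit values with no per-field tags. The field names other than intr_state and all values are hypothetical, and this is not QEMU's actual stream format or the real e1000e layout.

```python
import struct

# Hypothetical layouts: the source still saves intr_state, the destination
# no longer expects it.
SRC_LAYOUT = ["field_a", "intr_state", "field_b", "field_c"]
DST_LAYOUT = ["field_a", "field_b", "field_c"]


def save(layout, values):
    """Serialize the fields in layout order as big-endian uint32 values."""
    return b"".join(struct.pack(">I", values[name]) for name in layout)


def load(layout, blob):
    """Read one uint32 per field in layout order, ignoring trailing bytes."""
    count = len(layout)
    decoded = struct.unpack(f">{count}I", blob[: 4 * count])
    return dict(zip(layout, decoded))


source_state = {"field_a": 1, "intr_state": 0xABCD, "field_b": 2, "field_c": 3}
blob = save(SRC_LAYOUT, source_state)
loaded = load(DST_LAYOUT, blob)
print(loaded)
```

In this model every field after the extra one is read from the wrong offset, so the destination ends up with values that belong to other fields.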

Comment 5 Dr. David Alan Gilbert 2017-02-09 13:08:15 UTC
Posted downstream fix.

[RHEL-7.4 qemu-kvm-rhev PATCH 1/1] migcompat/e1000e: Work around 7.3 msi/intr_state field

QE: It probably needs testing on a few different OSs and machine types etc to make sure network traffic is OK after migration.

Comment 6 Miroslav Rezanina 2017-02-20 10:07:35 UTC
Fix included in qemu-kvm-rhev-2.8.0-5.el7

Comment 11 huiqingding 2017-05-11 07:44:24 UTC
Based on comment #8 and #10, set this bug to be verified and continue to track bz1447935.

Comment 13 errata-xmlrpc 2017-08-01 23:44:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
