Bug 1378334

Summary: Windows guest migration from rhel6.8-z to rhel7.3 with virtio-net-pci fails
Product: Red Hat Enterprise Linux 7
Reporter: lijin <lijin>
Component: qemu-kvm-rhev
Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA
QA Contact: huiqingding <huding>
Severity: high
Docs Contact: Jiri Herrmann <jherrman>
Priority: high
Version: 7.3
CC: ailan, chayang, dgilbert, hhuang, huding, jen, jsuchane, juzhang, knoel, lijin, lprosek, michal.skrivanek, mrezanin, mst, mtessun, pbonzini, phou, sherold, snagar, virt-bugs, virt-maint, wyu, ykaul, ymankad, yvugenfi
Target Milestone: rc
Keywords: Regression, ZStream
Target Release: 7.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.8.0-1.el7
Doc Type: Bug Fix
Doc Text:
Attempting to migrate a Windows guest virtual machine that was using the virtio-net-pci device from a Red Hat Enterprise Linux (RHEL) 6 host to a RHEL 7.3 host previously caused the guest to terminate unexpectedly, because the ctrl_guest_offloads feature was disabled on the destination host. This update enables ctrl_guest_offloads on the destination host, and the described migration works as expected.
Story Points: ---
Clone Of:
: 1392876
Environment:
Last Closed: 2017-08-01 23:34:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1384587, 1392876, 1395265, 1401400

Description lijin 2016-09-22 07:45:39 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Boot a win8-32 guest on rhel6.8-z with a netkvm device:
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:4c:23:4d:90

2. Install the netkvm driver from virtio-win-1.9.0-3.el7.noarch.

3. Start a listening guest on the rhel7.3 host:

4. Migrate the guest from rhel6.8-z to rhel7.3.


Actual results:
After step 4, migration failed with:
qemu-kvm: Features 0x301f99a7 unsupported. Allowed features: 0x719fffe3
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:05.0/virtio-net'
copying E and F segments from pc.bios to pc.ram
copying C and D segments from pc.rom to pc.ram
qemu-kvm: load of migration failed: Operation not permitted


Expected results:
Migration succeeds.

Additional info:
I tried with build110 (the rhel7.2 virtio-win package) and did NOT hit this issue, so it's a regression.

Comment 1 lijin 2016-09-22 07:48:41 UTC
Package info:
rhel6.8-z: 
qemu-kvm-rhev-0.12.1.2-2.491.el6_8.3 
kernel-2.6.32-642.8.1.el6
seabios-0.6.1.2-30.el6

rhel7.3: 
qemu-kvm-rhev-2.6.0-26.el7
kernel-3.10.0-510.el7
seabios-1.9.1-5.el7
virtio-win-1.9.0-3.el7

Comment 4 Yu Wang 2016-09-22 09:18:19 UTC
win10-32 hits the same issue.

Thanks
Yu Wang

Comment 5 Paolo Bonzini 2016-09-26 15:28:05 UTC
Please include the full QEMU command line (comment 0 has incomplete command line).

Comment 6 lijin 2016-09-27 02:56:58 UTC
(In reply to Paolo Bonzini from comment #5)
> Please include the full QEMU command line (comment 0 has incomplete command
> line).

Full QEMU command line:
/usr/libexec/qemu-kvm -enable-kvm -m 2G -smp 2 -M rhel6.6.0 -nodefconfig -nodefaults -cpu SandyBridge -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=en_windows_8_enterprise_x86_dvd_917587.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,drive=drive-ide0-1-0,id=ide0-1-0,bus=ide.0,unit=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -cdrom virtio-win-1.9.0.iso -monitor stdio -qmp tcp:0:4444,server,nowait -vnc 0.0.0.0:0 -vga cirrus -drive file=win8-32-rhel6u8.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:4c:23:4d:90

Comment 7 Yvugenfi@redhat.com 2016-09-27 08:41:35 UTC
Hi Michael,

Are you aware why VIRTIO_NET_F_CTRL_GUEST_OFFLOADS was removed from RHEL7.3?

Thanks,
Yan.

Comment 8 Yvugenfi@redhat.com 2016-09-27 08:45:48 UTC
Could this commit be the cause?

From 1a261d8dc6205bfd88ab45a906691e0a3819c0cb Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert>
Date: Mon, 15 Aug 2016 10:40:18 +0200
Subject: [PATCH 17/17] Revert "virtio-net: unbreak self announcement and guest
 offloads after migration"

RH-Author: Dr. David Alan Gilbert <dgilbert>
Message-id: <1471257618-19311-3-git-send-email-dgilbert>
Patchwork-id: 71958
O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 2/2] Revert "virtio-net: unbreak self announcement and guest offloads after migration"
Bugzilla: 1365747
RH-Acked-by: Marcel Apfelbaum <marcel>
RH-Acked-by: Xiao Wang <jasowang>
RH-Acked-by: Michael S. Tsirkin <mst>

From: "Michael S. Tsirkin" <mst>

This reverts commit 1f8828ef573c83365b4a87a776daf8bcef1caa21.

Comment 9 Jaroslav Suchanek 2016-09-27 12:41:23 UTC
(In reply to lijin from comment #0)

> 4.migrate guest from rhel6.8-z to rhel7.3
> 
> 
> Actual results:
> After step4,migration failed with:
> qemu-kvm: Features 0x301f99a7 unsupported. Allowed features: 0x719fffe3
> qemu-kvm: error while loading state for instance 0x0 of device
> '0000:00:05.0/virtio-net'
> copying E and F segments from pc.bios to pc.ram
> copying C and D segments from pc.rom to pc.ram
> qemu-kvm: load of migration failed: Operation not permitted
> 
> 

Is the guest running on the source host after the failure?

Comment 10 Dr. David Alan Gilbert 2016-09-27 12:44:53 UTC
I'd rather check with mst on this; I just backported that after identifying it as the likely problem in my original bz.

Comment 11 Ladi Prosek 2016-09-27 14:32:31 UTC
I have installed qemu-kvm-rhev-2.6.0-26.el7 on my test host and can successfully load/restore a Windows VM with the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature negotiated. The only way to get the error mentioned in comment #0 for me is by explicitly disabling ctrl_guest_offloads by adding ctrl_guest_offloads=false to the command line.

qemu-kvm: Features 0x1301f8024 unsupported. Allowed features: 0x179bf8060
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'

Note that the missing bit here is also 2 (mask 0x04).
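
The bit arithmetic behind that note can be checked directly. The following is a minimal Python sketch, not QEMU code: the helper name `missing_feature_bits` is hypothetical, and it assumes the destination rejects any feature bit that is set in the saved state but absent from its allowed mask. The masks are the ones quoted in this bug.

```python
def missing_feature_bits(features, allowed):
    """Return the bit positions set in `features` but not in `allowed`."""
    missing = features & ~allowed
    return [bit for bit in range(missing.bit_length()) if missing >> bit & 1]


# Masks copied from the error messages in comment 0 and comment 11.
print(missing_feature_bits(0x301f99a7, 0x719fffe3))    # comment 0
print(missing_feature_bits(0x1301f8024, 0x179bf8060))  # comment 11
```

Both calls report only bit 2 (mask 0x04), which is the position of VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.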

Li Jin, does this reproduce if you perform a "pseudo-migration" [1] - just save the VM state into a file and then load it back? If so, would it be possible to share the state file with us? Thanks!

[1] http://www.linux-kvm.org/page/Migration#savevm.2Floadvm_to_an_external_state_file_.28using_pseudo-migration.29

Comment 19 Michael S. Tsirkin 2016-09-27 19:46:11 UTC
cleared needinfo by mistake - put it back

Comment 21 lijin 2016-09-28 01:43:42 UTC
(In reply to Jaroslav Suchanek from comment #9)
> (In reply to lijin from comment #0)
> 
> > 4.migrate guest from rhel6.8-z to rhel7.3
> > 
> > 
> > Actual results:
> > After step4,migration failed with:
> > qemu-kvm: Features 0x301f99a7 unsupported. Allowed features: 0x719fffe3
> > qemu-kvm: error while loading state for instance 0x0 of device
> > '0000:00:05.0/virtio-net'
> > copying E and F segments from pc.bios to pc.ram
> > copying C and D segments from pc.rom to pc.ram
> > qemu-kvm: load of migration failed: Operation not permitted
> > 
> > 
> 
> Is the guest running on the source host after the failure?

The guest is paused on the source host; after continuing it, the guest is alive.

Comment 22 lijin 2016-09-28 02:26:00 UTC
(In reply to Ladi Prosek from comment #11)
> I have installed qemu-kvm-rhev-2.6.0-26.el7 on my test host and can
> successfully load/restore a Windows VM with the
> VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature negotiated. The only way to get the
> error mentioned in comment #0 for me is by explicitly disabling
> ctrl_guest_offloads by adding ctrl_guest_offloads=false to the command line.
> 
> qemu-kvm: Features 0x1301f8024 unsupported. Allowed features: 0x179bf8060
> qemu-kvm: error while loading state for instance 0x0 of device
> '0000:00:03.0/virtio-net'

I can also reproduce this issue with a local migration on the rhel7.3 host, migrating from ctrl_guest_offloads=true to ctrl_guest_offloads=false.

> Note that the missing bit here is also 2 (mask 0x04).
> 
> Li Jin, does this reproduce if you perform a "pseudo-migration" [1] - just
> save the VM state into a file and then load it back? If so, would it be
> possible to share the state file with us? Thanks!

Yes, it reproduces with a pseudo-migration.
src:
/usr/libexec/qemu-kvm -enable-kvm -m 2G -smp 2 -M pc -nodefconfig -nodefaults -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=en_windows_8_enterprise_x86_dvd_917587.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,drive=drive-ide0-1-0,id=ide0-1-0,bus=ide.0,unit=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -cdrom virtio-win-1.7.5.iso -monitor stdio -qmp tcp:0:4444,server,nowait -vnc 0.0.0.0:0 -vga cirrus -object iothread,id=thread0 -drive file=win8-32-rhel7u3.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,iothread=thread0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:4c:23:6d:00,ctrl_guest_offloads=true
(qemu) stop  
(qemu) migrate_set_speed 4095m                 
(qemu) migrate "exec:gzip -c > STATEFILE.gz"      


dst:
/usr/libexec/qemu-kvm -enable-kvm -m 2G -smp 2 -M pc -nodefconfig -nodefaults -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=en_windows_8_enterprise_x86_dvd_917587.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,drive=drive-ide0-1-0,id=ide0-1-0,bus=ide.0,unit=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -cdrom virtio-win-1.7.5.iso -monitor stdio -qmp tcp:0:4445,server,nowait -vnc 0.0.0.0:1 -vga cirrus -object iothread,id=thread0 -drive file=win8-32-rhel7u3.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device virtio-blk-pci,iothread=thread0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:4c:23:6d:00,ctrl_guest_offloads=off -incoming "exec: gzip -c -d STATEFILE.gz"
char device redirected to /dev/pts/7 (label charserial0)
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) qemu-kvm: Features 0x1301f99a7 unsupported. Allowed features: 0x179bfffe3
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-net'
gzip: stdout: Broken pipe
qemu-kvm: load of migration failed: Operation not permitted

The STATEFILE.gz can be found at http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/virtio-win/bug1378334/
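
For triaging logs like the one above, a small parser can pull the two masks out of the message and name the rejected bits. This is a hedged Python sketch: `explain_feature_error` and the partial `VIRTIO_NET_FEATURES` table are illustrative names, the regular expression assumes the exact message format shown in this bug, and only a few feature bits from the virtio specification are listed.

```python
import re

# Partial table of virtio-net feature bits (bit numbers per the virtio spec).
VIRTIO_NET_FEATURES = {
    0: "VIRTIO_NET_F_CSUM",
    1: "VIRTIO_NET_F_GUEST_CSUM",
    2: "VIRTIO_NET_F_CTRL_GUEST_OFFLOADS",
}

# Assumed message format, taken from the log lines quoted in this bug.
FEATURE_ERROR = re.compile(
    r"Features (0x[0-9a-fA-F]+) unsupported\. Allowed features: (0x[0-9a-fA-F]+)"
)


def explain_feature_error(line):
    """Return names (or 'bit N') of features the destination rejected."""
    match = FEATURE_ERROR.search(line)
    if match is None:
        return []
    features, allowed = (int(value, 16) for value in match.groups())
    missing = features & ~allowed
    return [
        VIRTIO_NET_FEATURES.get(bit, f"bit {bit}")
        for bit in range(missing.bit_length())
        if missing >> bit & 1
    ]


log_line = "(qemu) qemu-kvm: Features 0x1301f99a7 unsupported. Allowed features: 0x179bfffe3"
print(explain_feature_error(log_line))
```

Lines that do not match the assumed format return an empty list, so the function can be run over a whole log without special-casing other messages.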

Comment 35 Jeff Nelson 2016-09-30 14:16:55 UTC
The work to solve this problem as described in comment 30 item (c) is in qemu-kvm-rhev, so changing the component to match.

Note that this bug affects qemu-kvm as well as qemu-kvm-rhev, but we only plan to fix qemu-kvm-rhev. That's because the impact of the bug is only on Windows guests and Windows guests aren't supported by qemu-kvm.

Comment 39 Peixiu Hou 2016-10-11 07:39:18 UTC
Verified this bug with the qemu from https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11881570.

Steps as in comment #0.
Migrated the guest from rhel6.8-z to rhel7.3 successfully with virtio-net-pci. The bug is fixed.


Best Regards~
Peixiu Hou

Comment 48 huiqingding 2017-02-22 07:43:15 UTC
Reproduced this bug using:
Source host (rhel6.8.z):
kernel-2.6.32-642.4.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.491.el6_8.7.x86_64

Destination host (rhel7.3.0):
kernel-3.10.0-514.12.1.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64

Reproduce steps:
1. Boot a win8-32 guest on rhel6.8-z with a netkvm device:
# /usr/libexec/qemu-kvm \
-enable-kvm \
-m 2G \
-smp 2 \
-M rhel6.6.0 \
-nodefconfig \
-nodefaults \
-cpu SandyBridge \
-rtc base=localtime,driftfix=slew \
-boot order=cd,menu=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2  \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=isa_serial0 \
-device usb-tablet,id=input0 \
-cdrom /mnt/virtio-win-1.9.0.iso \
-monitor stdio \
-qmp tcp:0:4444,server,nowait \
-vnc 0.0.0.0:0 \
-vga cirrus \
-drive file=win8-32.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none \
-device virtio-blk-pci,drive=drive-ide0-0-0,id=ide0-0-0 \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:4c:23:4d:90

2. Install the netkvm driver from virtio-win-1.9.0-3.el7.noarch.

3. Start a listening guest on the rhel7.3 host:

4. Migrate the guest from rhel6.8-z to rhel7.3.


Actual results:
After step 4, migration failed with:
qemu-kvm: Features 0x301f99a7 unsupported. Allowed features: 0x719fffe3
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:05.0/virtio-net'
copying E and F segments from pc.bios to pc.ram
copying C and D segments from pc.rom to pc.ram
qemu-kvm: load of migration failed: Operation not permitted

Verified this bug using:
Source host (rhel6.8.z):
kernel-2.6.32-642.4.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.491.el6_8.7.x86_64

Destination host (rhel7.3.0):
kernel-3.10.0-572.el7.x86_64
qemu-kvm-rhev-2.8.0-5.el7.x86_64

Using the same steps, migration from rhel6.8.z to rhel7.4 finishes normally and the guest can ping another host.

Comment 50 huiqingding 2017-02-22 08:01:42 UTC
Based on comment #48, setting this bug to verified.

Comment 52 errata-xmlrpc 2017-08-01 23:34:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
