Bug 1262232 - self announcement and ctrl offloads does not work after migration
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: jason wang
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks:
Reported: 2015-09-11 04:32 EDT by jason wang
Modified: 2015-12-04 11:57 EST (History)

See Also:
Fixed In Version: qemu-kvm-rhev-2.3.0-28.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-04 11:57:43 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
The gdbview log that before the migration (16.88 KB, text/plain)
2015-10-20 01:42 EDT, Qian Guo
The gdbview log that after the migration (20.94 KB, text/plain)
2015-10-20 01:43 EDT, Qian Guo

Description jason wang 2015-09-11 04:32:12 EDT
Description of problem:

    After commit 019a3edbb25f1571e876f8af1ce4c55412939e5d ("virtio: make
    features 64bit wide"), the device's guest_features is actually set
    after vdc->load(). This breaks the assumption that the device-specific
    load() function can check guest_features. For virtio-net, self
    announcement and guest offloads do not work after migration.
    
    Fix this by deferring them to virtio_net_load(), where guest_features
    is guaranteed to be set. Other virtio devices look fine.
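
The ordering problem described above can be illustrated with a minimal sketch. It is not the QEMU code: `struct dev`, `arm_announce_if_negotiated()`, `load_broken()` and `load_fixed()` are hypothetical names invented for illustration, and the real load path does considerably more. The only value taken from outside this report is bit 21, which the virtio specification assigns to VIRTIO_NET_F_GUEST_ANNOUNCE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 21 is VIRTIO_NET_F_GUEST_ANNOUNCE in the virtio specification. */
#define F_GUEST_ANNOUNCE (1ULL << 21)

/* Hypothetical stand-in for the device state; not the QEMU structure. */
struct dev {
    uint64_t guest_features; /* features negotiated by the guest */
    int announce_armed;      /* 1 if self announcement was scheduled */
};

/* Feature-dependent step: only arm the announcement if negotiated. */
static void arm_announce_if_negotiated(struct dev *d)
{
    assert(d != NULL);
    d->announce_armed = (d->guest_features & F_GUEST_ANNOUNCE) != 0;
}

/* Broken order: the device hook runs before guest_features is restored. */
static void load_broken(struct dev *d, uint64_t saved_features)
{
    d->guest_features = 0;              /* not restored yet */
    arm_announce_if_negotiated(d);      /* sees 0, arms nothing */
    d->guest_features = saved_features; /* restored too late */
}

/* Fixed order: restore the features first, then do the dependent work. */
static void load_fixed(struct dev *d, uint64_t saved_features)
{
    d->guest_features = saved_features;
    arm_announce_if_negotiated(d);
}
```

In both variants the final guest_features value is identical, which is why the breakage only shows up in side effects such as the missing announcement.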

Note: the corresponding qemu-kvm-rhev commit is:

commit c9736b10ca3daefdb7284a9b39f8ec88afe7382e
Author: Xiao Wang <jasowang@redhat.com>
Date:   Tue Jul 7 09:18:24 2015 +0200

    virtio: make features 64bit wide
    
    Message-id: <1436260751-25015-22-git-send-email-jasowang@redhat.com>
    Patchwork-id: 66796
    O-Subject: [RHEL7.2 qemu-kvm-rhev PATCH V2 21/68] virtio: make features 64bit wide
    Bugzilla: 1227343
    RH-Acked-by: Michael S. Tsirkin <mst@redhat.com>
    RH-Acked-by: David Gibson <dgibson@redhat.com>
    RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
    RH-Acked-by: Thomas Huth <thuth@redhat.com>


Version-Release number of selected component (if applicable):


How reproducible:

100%

Steps to Reproduce:
1. Run tcpdump on the tap device after migration.

Actual results:
tcpdump shows no GARP packets.

Expected results:
Guest should send GARP.

Additional info:
Comment 1 jason wang 2015-09-11 04:33:03 EDT
Upstream posted at http://lists.gnu.org/archive/html/qemu-devel/2015-09/msg03059.html
Comment 5 Miroslav Rezanina 2015-10-05 01:23:39 EDT
Fix included in qemu-kvm-rhev-2.3.0-28.el7
Comment 6 Qian Guo 2015-10-09 03:35:49 EDT
Hi, Jason

I tested with the unfixed and the fixed builds and got the following results:

With the unfixed build:
qemu-kvm-rhev-2.3.0-24.el7.x86_64

After migration, tcpdump on the tap shows:

14:44:59.248334 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32
14:44:59.281400 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:44:59.431372 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:44:59.474553 IP 192.168.1.11 > 192.168.1.1: ICMP echo request, id 3748, seq 31, length 64
14:44:59.474585 IP 192.168.1.1 > 192.168.1.11: ICMP echo reply, id 3748, seq 31, length 64
14:44:59.681413 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:45:00.031383 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46




With the fixed build:
qemu-kvm-rhev-2.3.0-29.el7.x86_64

After migration, tcpdump on the tap shows:

14:20:58.807154 ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28
14:20:58.807173 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32
14:20:58.833400 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:20:58.856625 ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28
14:20:58.856655 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32
14:20:58.983374 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:20:59.006655 ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28
14:20:59.006674 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32
14:20:59.233394 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:20:59.256616 ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28
14:20:59.256639 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32
14:20:59.583391 ARP, Reverse Request who-is 54:52:01:2a:0b:01 (oui Unknown) tell 54:52:01:2a:0b:01 (oui Unknown), length 46
14:20:59.606607 ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28
14:20:59.606626 IP6 fe80::5652:1ff:fe2a:b01 > ff02::1: ICMP6, neighbor advertisement, tgt is fe80::5652:1ff:fe2a:b01, length 32



From the results, with the fixed build there are ARP requests of the form

"ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28"

Is this the expected result with the fix?

Thanks,
Qian
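
For readers comparing the two captures: the line that differs is an ARP request whose sender and target IP are both 192.168.1.11, which is the usual shape of a gratuitous ARP. The Reverse Request lines appear in both captures and so do not distinguish the builds. A minimal sketch of that check follows; `struct arp_ipv4` and `is_gratuitous_arp()` are hypothetical helpers, not tcpdump or QEMU code, and the struct holds already-parsed fields rather than the on-wire layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parsed fields of an IPv4-over-Ethernet ARP packet (hypothetical type). */
struct arp_ipv4 {
    uint16_t oper;   /* 1 = request, 2 = reply, in host byte order */
    uint8_t  sha[6]; /* sender hardware address */
    uint8_t  spa[4]; /* sender protocol (IPv4) address */
    uint8_t  tha[6]; /* target hardware address */
    uint8_t  tpa[4]; /* target protocol (IPv4) address */
};

/* Returns 1 if the packet announces the sender's own IP address. */
static int is_gratuitous_arp(const struct arp_ipv4 *a)
{
    assert(a != NULL);
    return (a->oper == 1 || a->oper == 2) &&
           memcmp(a->spa, a->tpa, sizeof(a->spa)) == 0;
}
```

Applied to the fixed-build capture, "who-has 192.168.1.11 tell 192.168.1.11" has equal sender and target addresses, so it would be classified as gratuitous.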
Comment 7 jason wang 2015-10-14 21:42:45 EDT
(In reply to Qian Guo from comment #6)
> From the result, with the fixed build, there're arp request 
> 
> "ARP, Request who-has 192.168.1.11 tell 192.168.1.11, length 28"
> 
> Is this the fixed expected result.

Yes.

Thanks

> 
> Thanks,
> Qian
Comment 8 Qian Guo 2015-10-14 21:45:56 EDT
According to comment 7, this bug is fixed.
Comment 9 Qian Guo 2015-10-14 22:13:55 EDT
(In reply to Qian Guo from comment #8)
> According to comment 7, this bug is fixed.

After talking with Jason, I will test the offloads part with a Windows guest and update here once I have the result.

Thanks,
Qian
Comment 10 Qian Guo 2015-10-20 01:40:32 EDT
Hi, Yan & Jason

I followed Yan's suggestions, but I could not reproduce the problem with the unfixed build (qemu-kvm-rhev-2.3.0-26.el7.x86_64).

I disabled GUEST_TSO in the source qemu:
-device virtio-net-pci,guest_tso4=off,guest_tso6=off,id=net0,netdev=hostnet0,mac=52:54:01:a2:0c:01

Then, inside the guest, I disabled and re-enabled the netkvm driver. The following gdbview output shows that the GUEST_TSO features are not loaded:

.......

00000079	5:18:42 AM	[ParaNdis_InitializeContext] Message interrupt assigned	
00000080	5:18:42 AM	[ParaNdis_ResetVirtIONetDevice] Done	
00000081	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CSUM	
00000082	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_CSUM	
00000083	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_MAC	
00000084	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_GSO	
00000085	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_ECN	
00000086	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_UFO	
00000087	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_TSO4	
00000088	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_TSO6	
00000089	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_ECN	
00000090	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_UFO	
00000091	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_MRG_RXBUF	
00000092	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_STATUS	
00000093	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_VQ	
00000094	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_RX	
00000095	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_VLAN	
00000096	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_RX_EXTRA	
00000097	5:18:42 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_MAC_ADDR	
00000098	5:18:42 AM	VirtIO Host Feature VIRTIO_F_INDIRECT	
00000099	5:18:42 AM	VirtIO Host Feature VIRTIO_F_ANY_LAYOUT	
00000100	5:18:42 AM	VirtIO Host Feature VIRTIO_RING_F_EVENT_IDX	
00000101	5:18:42 AM	[ParaNdis_InitializeContext] Link status on driver startup: 1	
00000102	5:18:42 AM	Permanent device MAC: 52-54-01-a2-0c-01	
00000103	5:18:42 AM	No valid MAC configured	
00000104	5:18:42 AM	Actual MAC: 52-54-01-a2-0c-01	
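
As a reading aid for the feature list above: the guest offload features are individual bits in the virtio-net feature mask, and guest_tso4=off,guest_tso6=off clears two of them. The sketch below uses the bit numbers from the virtio specification; `has_feature()` and `guest_offload_bits()` are hypothetical helpers written for this note, not QEMU or netkvm functions.

```c
#include <assert.h>
#include <stdint.h>

/* Guest offload feature bit numbers as given in the virtio specification. */
enum {
    VIRTIO_NET_F_GUEST_CSUM = 1,
    VIRTIO_NET_F_GUEST_TSO4 = 7,
    VIRTIO_NET_F_GUEST_TSO6 = 8,
    VIRTIO_NET_F_GUEST_ECN  = 9,
    VIRTIO_NET_F_GUEST_UFO  = 10
};

/* Returns 1 if the given bit is set in the feature mask. */
static int has_feature(uint64_t features, unsigned bit)
{
    assert(bit < 64);
    return (int)((features >> bit) & 1);
}

/* Keeps only the guest offload bits of a feature mask. */
static uint64_t guest_offload_bits(uint64_t features)
{
    const uint64_t mask = (1ULL << VIRTIO_NET_F_GUEST_CSUM) |
                          (1ULL << VIRTIO_NET_F_GUEST_TSO4) |
                          (1ULL << VIRTIO_NET_F_GUEST_TSO6) |
                          (1ULL << VIRTIO_NET_F_GUEST_ECN)  |
                          (1ULL << VIRTIO_NET_F_GUEST_UFO);
    return features & mask;
}
```

With the features logged above (GUEST_CSUM, GUEST_ECN and GUEST_UFO present, GUEST_TSO4 and GUEST_TSO6 absent), `has_feature()` would report the two TSO bits as clear, which matches the -device options used on the source.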


Then I migrated and, on the destination, disabled and re-enabled the driver again. In gdbview, the GUEST_TSO features are not loaded either:

00000079	5:20:15 AM	[ParaNdis_InitializeContext] Message interrupt assigned	
00000080	5:20:15 AM	[ParaNdis_ResetVirtIONetDevice] Done	
00000081	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CSUM	
00000082	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_CSUM	
00000083	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_MAC	
00000084	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_GSO	
00000085	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_ECN	
00000086	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_GUEST_UFO	
00000087	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_TSO4	
00000088	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_TSO6	
00000089	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_ECN	
00000090	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_HOST_UFO	
00000091	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_MRG_RXBUF	
00000092	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_STATUS	
00000093	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_VQ	
00000094	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_RX	
00000095	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_VLAN	
00000096	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_RX_EXTRA	
00000097	5:20:15 AM	VirtIO Host Feature VIRTIO_NET_F_CTRL_MAC_ADDR	
00000098	5:20:15 AM	VirtIO Host Feature VIRTIO_F_INDIRECT	
00000099	5:20:15 AM	VirtIO Host Feature VIRTIO_F_ANY_LAYOUT	
00000100	5:20:15 AM	VirtIO Host Feature VIRTIO_RING_F_EVENT_IDX	
00000101	5:20:15 AM	[ParaNdis_InitializeContext] Link status on driver startup: 1	
00000102	5:20:15 AM	Permanent device MAC: 52-54-01-a2-0c-01	
00000103	5:20:15 AM	No valid MAC configured	
00000104	5:20:15 AM	Actual MAC: 52-54-01-a2-0c-01	



So, with the unfixed build, the GUEST_TSO offloads work the same before and after migration.

Maybe I missed something. Could you help check the logs? I will attach them.

Thanks,
Qian
Comment 11 Qian Guo 2015-10-20 01:42 EDT
Created attachment 1084600 [details]
The gdbview log that before the migration
Comment 12 Qian Guo 2015-10-20 01:43 EDT
Created attachment 1084601 [details]
The gdbview log that after the migration
Comment 13 Qian Guo 2015-10-22 23:30:52 EDT
Additional info:

I also tested with qemu-kvm-rhev-2.3.0-31.el7.x86_64 and got the same result as in comment 10.
Feel free to contact me if any further testing is needed.

Thanks,
Qian
Comment 14 Qian Guo 2015-10-23 01:36:54 EDT
Tested with the unfixed and fixed builds in the following scenario:

1. Enable TSO (it is on by default, and I also enabled it in the CLI).
2. Disable LSO4/LSO6 inside the Windows guest; there is no switch to control the TSO feature.
3. Run netperf with the guest as client and the host as netserver.
* The throughput is 3032.04 (10^6 bit/s).
4. Migrate, then run netperf again.
* The throughput is 3177.41 (10^6 bit/s).

Additionally, I also checked gdbview: there is no change before and after migration.

Hi, Jason

Could you check whether we can set this bug as verified based on the above tests and comments? :)

Thanks,
Qian
Comment 15 jason wang 2015-10-23 01:59:22 EDT
(In reply to Qian Guo from comment #14)
> Tested with unfixed and fixed builds with following scenario:
> 
> 1.enable the tso(by default, and I also enabled it in the cli).
> 2.disable LSO4/LSO6 inside windows guest, there's no switch to control the
> tso feature.
> 3.Test netperf inside guest as client and host as netserver
> *the throughput is 3032.04(10^6 bit/s)
> 4.migration, then test netperf again
> *the throughput is 3177.41(10^6 bit/s)
> 
> Additionally, I also checked gdbview, no change before and after migration
> 
> Hi, Jason
> 
> Could you help have a check if we can set this bug as verified due to above
> tests and comments :)
> 
> Thanks,
> Qian

I think the bug can be set to verified for now. Leave the Windows guest offload control for future investigation.
Comment 16 juzhang 2015-10-23 02:03:55 EDT
(In reply to jason wang from comment #15)
> (In reply to Qian Guo from comment #14)
> > Tested with unfixed and fixed builds with following scenario:
> > 
> > 1.enable the tso(by default, and I also enabled it in the cli).
> > 2.disable LSO4/LSO6 inside windows guest, there's no switch to control the
> > tso feature.
> > 3.Test netperf inside guest as client and host as netserver
> > *the throughput is 3032.04(10^6 bit/s)
> > 4.migration, then test netperf again
> > *the throughput is 3177.41(10^6 bit/s)
> > 
> > Additionally, I also checked gdbview, no change before and after migration
> > 
> > Hi, Jason
> > 
> > Could you help have a check if we can set this bug as verified due to above
> > tests and comments :)
> > 
> > Thanks,
> > Qian
> 
> I think the bug could be verified first. Leave the windows guest control
> offload for future investigation.

Got it.

Hi Jason and Yan,

If any further testing with a Windows guest is needed later, feel free to let us know.
Comment 17 juzhang 2015-10-23 02:04:56 EDT
According to comment 7 and comment 15, setting this issue as verified.
Comment 19 errata-xmlrpc 2015-12-04 11:57:43 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html
