Bug 1263591 - Guest network works abnormally (ping or netperf test fails) when using multiple queues with a virtio-net-pci macvtap device
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: ppc64le Linux
Priority: high    Severity: high
Target Milestone: rc
Target Release: 7.3
Assigned To: Laurent Vivier
QA Contact: Virtualization Bugs
Depends On:
Blocks: RHV4.1PPC 1288337
 
Reported: 2015-09-16 05:07 EDT by Gu Nini
Modified: 2016-11-07 15:38 EST (History)
CC List: 20 users

See Also:
Fixed In Version: qemu-2.6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-07 15:38:29 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers:
IBM Linux Technology Center 135276 (Last Updated: 2016-01-14 11:10 EST)

Description Gu Nini 2015-09-16 05:07:29 EDT
Description of problem:
If a guest is booted with a macvtap virtual net port and multiqueue is then enabled for it, the guest fails to ping out.

Version-Release number of selected component (if applicable):
Host kernel: 3.10.0-306.0.1.el7.ppc64le
Guest kernel: 3.10.0-229.14.1.el7.ppc64
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-22.el7.ppc64le

How reproducible:
100%


Steps to Reproduce:
1. Create a macvtap on host net port enP3p9s0f0:
# ip link add link enP3p9s0f0 name macvtap0 type macvtap mode bridge
# ip link set macvtap0 address c2:ac:d3:c7:c4:0f up
# ll /dev/tap*
crw-------. 1 root root 247, 1 Sep  9 22:33 /dev/tap672

2. Start a guest with the macvtap and set multi queue as on:
/usr/libexec/qemu-kvm -name spaprraw-0910 -machine pseries,accel=kvm,usb=off -m 2048M \
 -realtime mlock=off -smp 4,sockets=1,cores=4,threads=1 \
 -uuid 95346a10-1828-403a-a610-ac5a52a29416 -no-user-config -nodefaults -monitor stdio \
 -rtc base=utc,clock=host -no-shutdown -boot strict=on -device pci-ohci,id=ohci0 \
 -device spapr-vscsi,id=scsi0,reg=0x1000 \
 -drive file=/home/spaprraw-0910,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none \
 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
 -chardev pty,id=charserial0 -device spapr-vty,chardev=charserial0,reg=0x30000000 \
 -vnc 0:06 -device VGA,id=video0,bus=pci.0,addr=0x8 -msg timestamp=on \
 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 \
 -device virtio-net-pci,any_layout=on,netdev=macvtap0,mac=c2:ac:d3:c7:c4:0f,id=net1,vectors=10,mq=on \
 678<>/dev/tap672 679<>/dev/tap672 680<>/dev/tap672 681<>/dev/tap672 \
 -netdev tap,id=macvtap0,vhost=on,fds=678:679:680:681

The last three lines are the multiqueue-relevant part: each 678<>/dev/tap672 redirection opens one more queue on the macvtap device, and the resulting fds are handed to the tap netdev (see the C sketch after these steps).

3. After the guest boots up, check the guest net port eth0 channel parameters, and try to ping out to some external host:
# ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:		0
TX:		0
Other:		0
Combined:	4
Current hardware settings:
RX:		0
TX:		0
Other:		0
Combined:	1
# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.251 ms
64 bytes from 10.16.67.19: icmp_seq=2 ttl=61 time=0.195 ms
64 bytes from 10.16.67.19: icmp_seq=3 ttl=61 time=0.195 ms
64 bytes from 10.16.67.19: icmp_seq=4 ttl=61 time=0.211 ms
......

4. Stop the ping process from step 3, set the number of combined queues for eth0 to 2, then try pinging the external host again:
# ethtool -L eth0 combined 2
# ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:		0
TX:		0
Other:		0
Combined:	4
Current hardware settings:
RX:		0
TX:		0
Other:		0
Combined:	2
# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.516 ms
From 10.19.106.248 icmp_seq=39 Destination Host Unreachable
From 10.19.106.248 icmp_seq=40 Destination Host Unreachable
From 10.19.106.248 icmp_seq=41 Destination Host Unreachable
64 bytes from 10.16.67.19: icmp_seq=72 ttl=61 time=0.203 ms
64 bytes from 10.16.67.19: icmp_seq=82 ttl=61 time=0.273 ms
64 bytes from 10.16.67.19: icmp_seq=84 ttl=61 time=0.209 ms
64 bytes from 10.16.67.19: icmp_seq=100 ttl=61 time=0.271 ms
64 bytes from 10.16.67.19: icmp_seq=104 ttl=61 time=0.218 ms
64 bytes from 10.16.67.19: icmp_seq=108 ttl=61 time=0.264 ms
64 bytes from 10.16.67.19: icmp_seq=116 ttl=61 time=0.245 ms
64 bytes from 10.16.67.19: icmp_seq=132 ttl=61 time=0.312 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
......

5. Stop the ping process from step 4, set the number of combined queues for eth0 back to 1, then try pinging the external host again:
# ethtool -L eth0 combined 1
# ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:		0
TX:		0
Other:		0
Combined:	4
Current hardware settings:
RX:		0
TX:		0
Other:		0
Combined:	1
# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.269 ms
64 bytes from 10.16.67.19: icmp_seq=2 ttl=61 time=0.192 ms
64 bytes from 10.16.67.19: icmp_seq=3 ttl=61 time=0.198 ms
64 bytes from 10.16.67.19: icmp_seq=4 ttl=61 time=0.209 ms
......
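
For reference, a note on the fds in step 2: each open of a macvtap character device yields one more queue, which is what the 678<>/dev/tap672 ... redirections do. A minimal C sketch of the same idea (illustration only, not QEMU code; the path /dev/tap672 and the count of 4 queues are taken from steps 1 and 2, and error handling is abbreviated):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int fds[4];

    /* Each open() of the macvtap chardev creates one more queue,
     * mirroring the 678<>/dev/tap672 ... redirections in step 2. */
    for (int i = 0; i < 4; i++) {
        fds[i] = open("/dev/tap672", O_RDWR);
        if (fds[i] < 0) {
            perror("open /dev/tap672");
            exit(1);
        }
        printf("queue %d -> fd %d\n", i, fds[i]);
    }

    /* The fd numbers would then be handed to qemu-kvm via
     * -netdev tap,fds=...; here we simply close them again. */
    for (int i = 0; i < 4; i++)
        close(fds[i]);
    return 0;
}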


Actual results:
In step 4, after multiqueue is enabled for eth0, pinging out from the guest fails; most of the time the packet loss rate is 100%, although occasionally a few replies get through for a short while, as shown in the step.
In steps 3 and 5, when eth0 uses a single queue, the guest can ping out without any problem.

Expected results:
With multiple queues enabled, the net port should work correctly.

Additional info:
Comment 1 Gu Nini 2015-09-16 05:12:05 EDT
I have tried starting the guest with '-netdev tap,id=hostnet0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:16,vectors=10,mq=on' instead of the macvtap on another PowerPC host; with the same steps, the problem does not occur. (A C sketch of this tap multiqueue mechanism follows below.)
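
For comparison, with the queues=4 tap backend above, QEMU itself opens /dev/net/tun once per queue, passing the IFF_MULTI_QUEUE flag each time. A hedged C sketch of that mechanism (the interface name tap0 is illustrative; error handling is abbreviated):

#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open one queue of a multiqueue tap device; repeating the call with
 * the same name returns additional queue fds. */
static int open_tap_queue(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

int main(void)
{
    for (int i = 0; i < 4; i++)
        printf("queue %d -> fd %d\n", i, open_tap_queue("tap0"));
    return 0;
}
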
Comment 3 Laurent Vivier 2015-09-17 15:52:54 EDT
This seems to be an endianness issue, since a ppc64le guest works well with a ppc64le host.

Host kernel:   3.10.0-315.el7.ppc64le
Guest kernel:  3.10.0-315.el7.ppc64le
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-23.el7.ppc64le
Comment 4 Laurent Vivier 2015-09-17 19:28:47 EDT
I'm not able to reproduce it with ppc64 guest and ppc64le host.

Could you check with the latest releases, please?

Host kernel:   3.10.0-316.el7.ppc64le
Guest kernel:  3.10.0-316.el7.ppc64
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-23.el7.ppc64le
Comment 5 Gu Nini 2015-09-18 05:34:26 EDT
(In reply to Laurent Vivier from comment #4)
> I'm not able to reproduce it with ppc64 guest and ppc64le host.
> 
> Could you check with the latest releases, please?
> 
> Host kernel:   3.10.0-316.el7.ppc64le
> Guest kernel:  3.10.0-316.el7.ppc64
> Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-23.el7.ppc64le


It's a pity that I have no host available today.

In my later tests, the ping sometimes succeeded, but 'netperf -H 10.16.67.19 -l 300' still failed. Could you run the netperf test against an external host in the same way as the ping test? If you still cannot reproduce the bug, I will try it on the latest releases.

I have changed the bug summary accordingly.
Comment 6 David Gibson 2015-09-20 20:30:38 EDT
Not yet set as exception or blocker, and I don't see an immediate cause to.

Therefore, bumping to 7.3.
Comment 7 Laurent Vivier 2015-10-02 08:55:37 EDT
I'm able to reproduce the bug with netperf. Thanks.
Comment 8 Laurent Vivier 2015-10-07 18:15:54 EDT
It is not specific to the RHEL kernel/QEMU, as I have been able to reproduce it with the upstream kernel/QEMU (host/guest kernel 4.2 and QEMU 2.4.50, commit 5fdb467).
Comment 10 Laurent Vivier 2016-01-13 11:50:06 EST
The endianness is only set for the first tap device, which is why it cannot work when there are more tap devices. The fix is as simple as this:

--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -311,12 +311,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         goto err;
     }
 
-    r = vhost_net_set_vnet_endian(dev, ncs[0].peer, true);
-    if (r < 0) {
-        goto err;
-    }
-
     for (i = 0; i < total_queues; i++) {
+        r = vhost_net_set_vnet_endian(dev, ncs[i].peer, true);
+        if (r < 0) {
+            goto err;
+        }
         vhost_net_set_vq_index(get_vhost_net(ncs[i].peer), i * 2);
     }
 
But work is currently ongoing to move this into the vnet backend; I need to investigate further to see how to integrate this change with it.
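
The reason each queue needs its own call is that the kernel tracks the vnet header byte order per tap file descriptor. As an illustration of the underlying mechanism (not the QEMU code itself; TUNSETVNETLE needs a reasonably recent kernel, and fds[] is assumed to hold the device's queue fds):

#include <linux/if_tun.h>   /* TUNSETVNETLE */
#include <sys/ioctl.h>

/* Mark every queue fd of a tap/macvtap device as using little-endian
 * vnet headers; skipping any queue leaves it with the default byte
 * order, which is the mixed-endian situation behind this bug. */
static int set_vnet_le_all(const int *fds, int nfds)
{
    int le = 1;

    for (int i = 0; i < nfds; i++) {
        if (ioctl(fds[i], TUNSETVNETLE, &le) < 0)
            return -1;
    }
    return 0;
}
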
Comment 11 Greg Kurz 2016-01-13 14:27:28 EST
I've put my series on hold. Please proceed with your fix.
Comment 12 Laurent Vivier 2016-01-14 07:04:50 EST
Patch sent:

http://patchwork.ozlabs.org/patch/567120/
Comment 13 Greg Kurz 2016-01-27 04:56:29 EST
Hi Laurent,

This patch got 3 R-b tags on qemu-devel@... Is there anything that prevents upstream acceptance?

Thanks.

--
Greg
Comment 14 Laurent Vivier 2016-01-27 04:59:05 EST
(In reply to Greg Kurz from comment #13)
> Hi Laurent,
> 
> This patch got 3 R-b tags on qemu-devel@... Is there anything that prevents
> upstream acceptance?

No, but until it is in a maintainer branch, or better the master branch, we cannot be sure it will be accepted.
Comment 18 Laurent Vivier 2016-02-08 08:22:50 EST
Now upstream.

a407644 net: set endianness on all backend devices
Comment 19 Mike McCune 2016-03-28 18:58:03 EDT
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune@redhat.com with any questions
Comment 21 Zhengtong 2016-05-30 01:29:47 EDT
According to comment #0, I can reproduce the problem with the packages below:
Host kernel: 3.10.0-418.el7.ppc64le
qemu-kvm-rhev-2.5.0-4.el7
Guest kernel: 3.10.0-418.el7.ppc64

After upgrading qemu-kvm-rhev to qemu-kvm-rhev-2.6.0-4.el7, the problem no longer shows up, so I am moving the bug to VERIFIED.

Details:

1. Add macvtap device in host:
[root@ibm-p8-rhevm-10 test]# ip link add link enP3p9s0f0 name macvtap0 type macvtap mode bridge
[root@ibm-p8-rhevm-10 test]# ip link set macvtap0 address c2:ac:d3:c7:c4:0f up

2. Boot guest with the tap device:
/usr/libexec/qemu-kvm ...
-device virtio-net-pci,any_layout=on,netdev=macvtap0,mac=c2:ac:d3:c7:c4:0f,id=net1,vectors=10,mq=on 678<>/dev/tap7 679<>/dev/tap7 680<>/dev/tap7 681<>/dev/tap7 \
-netdev tap,id=macvtap0,vhost=on,fds=678:679:680:681
...

3. After the guest boots up, check the mq configuration:
[root@dhcp71-27 ~]# ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:		0
TX:		0
Other:		0
Combined:	4
Current hardware settings:
RX:		0
TX:		0
Other:		0
Combined:	1

Ping the external host IP:
[root@dhcp71-27 ~]# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.295 ms
64 bytes from 10.16.67.19: icmp_seq=2 ttl=61 time=0.256 ms
64 bytes from 10.16.67.19: icmp_seq=3 ttl=61 time=0.252 ms
64 bytes from 10.16.67.19: icmp_seq=4 ttl=61 time=0.250 ms
64 bytes from 10.16.67.19: icmp_seq=5 ttl=61 time=0.246 ms
64 bytes from 10.16.67.19: icmp_seq=6 ttl=61 time=0.263 ms


4. Stop the ping process, and change the mq configuration (the ioctl equivalent of this ethtool call is sketched after these steps):

[root@dhcp71-27 ~]# ethtool -L eth0 combined 2

5. Start pinging out again:
[root@dhcp71-27 ~]# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.321 ms
64 bytes from 10.16.67.19: icmp_seq=2 ttl=61 time=0.292 ms
64 bytes from 10.16.67.19: icmp_seq=3 ttl=61 time=0.270 ms
64 bytes from 10.16.67.19: icmp_seq=4 ttl=61 time=0.269 ms
64 bytes from 10.16.67.19: icmp_seq=5 ttl=61 time=0.272 ms
....
64 bytes from 10.16.67.19: icmp_seq=128 ttl=61 time=0.252 ms
64 bytes from 10.16.67.19: icmp_seq=129 ttl=61 time=0.262 ms
64 bytes from 10.16.67.19: icmp_seq=130 ttl=61 time=0.265 ms
64 bytes from 10.16.67.19: icmp_seq=131 ttl=61 time=0.271 ms
64 bytes from 10.16.67.19: icmp_seq=132 ttl=61 time=0.247 ms
64 bytes from 10.16.67.19: icmp_seq=133 ttl=61 time=0.262 ms

This step ran for quite a long time, and all the packets were transmitted successfully.

6. Stop the ping process and change the mq back to one queue:
[root@dhcp71-27 ~]# ethtool -L eth0 combined 1

7. Ping out again.
[root@dhcp71-27 ~]# ping 10.16.67.19
PING 10.16.67.19 (10.16.67.19) 56(84) bytes of data.
64 bytes from 10.16.67.19: icmp_seq=1 ttl=61 time=0.314 ms
64 bytes from 10.16.67.19: icmp_seq=2 ttl=61 time=0.213 ms
64 bytes from 10.16.67.19: icmp_seq=3 ttl=61 time=0.219 ms
64 bytes from 10.16.67.19: icmp_seq=4 ttl=61 time=0.245 ms
64 bytes from 10.16.67.19: icmp_seq=5 ttl=61 time=0.219 ms
64 bytes from 10.16.67.19: icmp_seq=6 ttl=61 time=0.209 ms
64 bytes from 10.16.67.19: icmp_seq=7 ttl=61 time=0.209 ms


The result is good.
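
For reference, the ethtool -l/-L calls above correspond to the ETHTOOL_GCHANNELS and ETHTOOL_SCHANNELS commands of the SIOCETHTOOL ioctl. A minimal sketch, assuming the guest interface is named eth0 as in the steps above (error handling abbreviated):

#include <linux/ethtool.h>
#include <linux/sockios.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Equivalent of "ethtool -L eth0 combined 2". */
int main(void)
{
    struct ethtool_channels ch;
    struct ifreq ifr;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
    ifr.ifr_data = (char *)&ch;

    /* Read the current channel configuration first, so the rx/tx/other
     * counts are written back unchanged ... */
    memset(&ch, 0, sizeof(ch));
    ch.cmd = ETHTOOL_GCHANNELS;
    if (ioctl(sock, SIOCETHTOOL, &ifr) < 0)
        perror("ETHTOOL_GCHANNELS");

    /* ... then raise the combined channel count to 2. */
    ch.cmd = ETHTOOL_SCHANNELS;
    ch.combined_count = 2;
    if (ioctl(sock, SIOCETHTOOL, &ifr) < 0)
        perror("ETHTOOL_SCHANNELS");

    close(sock);
    return 0;
}

Reading the current settings before writing is the same pattern the ethtool utility uses; it keeps the rx/tx/other counts at their existing values while only the combined count changes.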
Comment 23 errata-xmlrpc 2016-11-07 15:38:29 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html
