Bug 2122788
| Summary: | virtio-net TX stall after packet bursts (probably in qemu) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Stefano Brivio <sbrivio> |
| Component: | qemu-kvm | Assignee: | Laurent Vivier <lvivier> |
| qemu-kvm sub component: | Networking | QA Contact: | Lei Yang <leiyang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | medium | CC: | chayang, coli, dgibson, eperezma, jasowang, jferlan, jinzhao, juzhang, lvivier, mrezanin, mst, virt-maint, wquan |
| Version: | 9.2 | Keywords: | Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-7.2.0-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-05-09 07:20:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2135806 | ||
| Bug Blocks: | 2101375 | ||
|
Comment 1
Laurent Vivier
2022-09-08 10:43:00 UTC
(In reply to Laurent Vivier from comment #1)
> I'm able to reproduce the problem on your system with your binary but if I
> build QEMU from master on your system I'm not able to reproduce the problem.
>
> Could you provide your configure command line?

I'm not sure about Stefano, but I've just been using my distro qemu, specifically qemu-system-x86-6.2.0-14.fc36.x86_64. Maybe the bug has been fixed upstream already?

Sorry, I missed this comment:

(In reply to Laurent Vivier from comment #1)
> I'm able to reproduce the problem on your system with your binary but if I
> build QEMU from master on your system I'm not able to reproduce the problem.
>
> Could you provide your configure command line?

It's:

  ../configure --target-list=x86_64-softmmu

based on 09ed077d7f plus the latest version of your AF_UNIX socket patchset. Feature summary (from meson.log):

qemu 7.0.91

  Directories
    Install prefix: /usr/local
    BIOS directory: share/qemu
    firmware path: share/qemu-firmware
    binary directory: /usr/local/bin
    library directory: /usr/local/lib/x86_64-linux-gnu
    module directory: lib/x86_64-linux-gnu/qemu
    libexec directory: /usr/local/libexec
    include directory: /usr/local/include
    config directory: /usr/local/etc
    local state directory: /var/local
    Manual directory: /usr/local/share/man
    Doc directory: /usr/local/share/doc
    Build directory: /home/sbrivio/qemu/build
    Source path: /home/sbrivio/qemu
    GIT submodules: ui/keycodemapdb tests/fp/berkeley-testfloat-3 tests/fp/berkeley-softfloat-3 dtc slirp

  Host binaries
    git: git
    make: make
    python: /usr/bin/python3 (version: 3.10)
    sphinx-build: NO
    gdb: /usr/bin/gdb
    iasl: NO
    genisoimage:
    smbd: /usr/sbin/smbd

  Configurable features
    Documentation: NO
    system-mode emulation: YES
    user-mode emulation: NO
    block layer: YES
    Install blobs: YES
    module support: NO
    fuzzing support: NO
    Audio drivers: oss
    Trace backends: log
    D-Bus display: NO
    QOM debugging: NO
    vhost-kernel support: YES
    vhost-net support: YES
    vhost-user support: YES
    vhost-user-crypto support: YES
    vhost-user-blk server support: YES
    vhost-vdpa support: YES
    build guest agent: YES

  Compilation
    host CPU: x86_64
    host endianness: little
    C compiler: cc -m64 -mcx16
    Host C compiler: cc -m64 -mcx16
    C++ compiler: c++ -m64 -mcx16
    CFLAGS: -O2 -g
    CXXFLAGS: -O2 -g
    QEMU_CFLAGS: -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong
    QEMU_CXXFLAGS: -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wundef -Wwrite-strings -fno-strict-aliasing -fno-common -fwrapv -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong
    QEMU_OBJCFLAGS: -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition -Wno-tautological-type-limit-compare -Wno-psabi
    QEMU_LDFLAGS: -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -fstack-protector-strong
    profiler: NO
    link-time optimization (LTO): NO
    PIE: YES
    static build: NO
    malloc trim support: YES
    membarrier: NO
    debug stack usage: NO
    mutex debugging: NO
    memory allocator: system
    avx2 optimization: YES
    avx512f optimization: NO
    gprof enabled: NO
    gcov: NO
    thread sanitizer: NO
    CFI support: NO
    strip binaries: NO
    sparse: NO
    mingw32 support: NO

  Cross compilers
    x86_64: cc

  Targets and accelerators
    KVM support: YES
    HAX support: NO
    HVF support: NO
    WHPX support: NO
    NVMM support: NO
    Xen support: NO
    TCG support: YES
    TCG backend: native (x86_64)
    TCG plugins: YES
    TCG debug enabled: NO
    target list: x86_64-softmmu
    default devices: YES
    out of process emulation: YES
    vfio-user server: NO

  Block layer support
    coroutine backend: ucontext
    coroutine pool: YES
    Block whitelist (rw):
    Block whitelist (ro):
    Use block whitelist in tools: NO
    VirtFS support: YES
    build virtiofs daemon: YES
    Live block migration: YES
    replication support: YES
    bochs support: YES
    cloop support: YES
    dmg support: YES
    qcow v1 support: YES
    vdi support: YES
    vvfat support: YES
    qed support: YES
    parallels support: YES
    FUSE exports: NO
    VDUSE block exports: YES

  Crypto
    TLS priority: NORMAL
    GNUTLS support: NO
    libgcrypt: NO
    nettle: NO
    AF_ALG support: NO
    rng-none: NO
    Linux keyring: YES

  Dependencies
    SDL support: NO
    SDL image support: NO
    GTK support: NO
    pixman: YES 0.40.0
    VTE support: NO
    slirp support: YES 4.7.0
    libtasn1: NO
    PAM: NO
    iconv support: YES
    curses support: YES
    virgl support: NO
    curl support: NO
    Multipath support: NO
    PNG support: YES 1.6.37
    VNC support: YES
    VNC SASL support: NO
    VNC JPEG support: YES 2.1.2
    OSS support: YES
    ALSA support: YES 1.2.6.1
    PulseAudio support: NO
    JACK support: NO
    brlapi support: NO
    vde support: NO
    netmap support: NO
    l2tpv3 support: YES
    Linux AIO support: NO
    Linux io_uring support: NO
    ATTR/XATTR support: YES
    RDMA support: NO
    PVRDMA support: NO
    fdt support: internal
    libcap-ng support: YES
    bpf support: NO
    spice protocol support: NO
    rbd support: NO
    smartcard support: NO
    U2F support: NO
    libusb: NO
    usb net redir: NO
    OpenGL support (epoxy): NO
    GBM: NO
    libiscsi support: NO
    libnfs support: NO
    seccomp support: YES 2.5.4
    GlusterFS support: NO
    TPM support: YES
    libssh support: NO
    lzo support: NO
    snappy support: NO
    bzip2 support: NO
    lzfse support: NO
    zstd support: NO
    NUMA host support: YES
    capstone: NO
    libpmem support: NO
    libdaxctl support: NO
    libudev: NO
    FUSE lseek: NO
    selinux: YES 3.3

  Subprojects
    libvduse: YES
    libvhost-user: YES

  User defined options
    Native files: /home/sbrivio/qemu/build/config-meson.cross
    backend: ninja
    prefix: /usr/local
    werror: true

Note that there are two binaries installed on my system: one is from the distribution package (based on 7.0.0) in /usr/bin, the other one, which I described above, is in /usr/local/bin (taking precedence). Would it make sense for me to rebase the version under /usr/local/bin/ onto the latest upstream?

From a previous analysis (Stefano?):
> virtio_net_tx_bh() calls virtio_net_flush_tx()
>
> - if qemu_sendv_packet_async() returns 0 (where returning 0
> is a summary for a number of conditions...), which for some reason
> happens less frequently with a higher tx_burst, virtio_net_flush_tx()
> returns -EBUSY
>
> - in that case, virtio_net_tx_bh() doesn't reschedule. Sure, as the
> comment says, notifications are re-enabled by the callback for
> qemu_sendv_packet_async(), i.e. virtio_net_tx_complete(), which
> however doesn't take care of rescheduling
>
> - ...not even if we were in the middle of a "burst",
> virtio_net_tx_complete() does its job and calls virtio_net_flush_tx()
> one last time, and that's it. Perhaps it's fine in most cases, I'm
> not sure why I don't hit the stall reliably on -EBUSY from
> virtio_net_flush_tx().
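To make the sequence in the quoted analysis easier to follow, here is a toy model of it in C. This is a sketch, not QEMU code: `struct txq`, `flush_tx()`, `tx_bh()`, `tx_complete()` and `simulate()` are hypothetical names that only mirror the roles described above, and the model assumes the guest kicks once and never again. The `resched_on_complete` flag is likewise an assumption, showing one way the completion path could avoid leaving packets behind; it is not taken from any actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of the flow quoted above. NOT QEMU code: every name is hypothetical. */
struct txq {
    int pending;        /* packets queued by the guest, not yet completed */
    int tx_burst;       /* maximum number of packets one flush may send */
    int async_at;       /* send attempt on which the backend returns 0 (0 = never) */
    int attempts;       /* send attempts made so far */
    bool bh_scheduled;  /* the bottom half is queued to run */
    bool async_pending; /* the backend holds one packet and will call back */
};

/* Send up to tx_burst packets; -EBUSY if the backend went asynchronous. */
static int flush_tx(struct txq *q)
{
    int n = 0;

    while (q->pending > 0) {
        if (++q->attempts == q->async_at) {
            q->async_pending = true;
            return -EBUSY;
        }
        q->pending--;
        if (++n >= q->tx_burst) {
            break;
        }
    }
    return n;
}

/* Bottom half: reschedules itself only when the burst limit was reached. */
static void tx_bh(struct txq *q)
{
    int ret = flush_tx(q);

    if (ret == -EBUSY) {
        return;                 /* the completion callback is expected to resume */
    }
    if (ret >= q->tx_burst) {
        q->bh_scheduled = true; /* more packets may still be queued */
    }
}

/* Completion callback: flushes once; rescheduling is optional in this model. */
static void tx_complete(struct txq *q, bool resched_on_complete)
{
    int ret;

    q->async_pending = false;
    q->pending--;               /* the packet held by the backend is done */
    ret = flush_tx(q);
    if (resched_on_complete && ret >= q->tx_burst) {
        q->bh_scheduled = true;
    }
}

/* Run until nothing is scheduled; returns the number of packets left queued. */
int simulate(int packets, int tx_burst, int async_at, bool resched_on_complete)
{
    struct txq q = { .pending = packets, .tx_burst = tx_burst,
                     .async_at = async_at, .bh_scheduled = true };

    assert(packets >= 0 && tx_burst > 0);
    while (q.bh_scheduled || q.async_pending) {
        if (q.bh_scheduled) {
            q.bh_scheduled = false;
            tx_bh(&q);
        } else {
            tx_complete(&q, resched_on_complete);
        }
    }
    return q.pending;
}
```

In this model, `simulate(10, 3, 2, false)` ends with 5 packets still queued and nothing scheduled to send them, while `simulate(10, 3, 2, true)` drains the queue. Whether a real guest stalls also depends on whether it kicks the queue again, which the model deliberately leaves out.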
In the case of net/socket.c (net/stream.c with my patch series), the only reason qemu_sendv_packet_async() returns 0 is that net_socket_receive() (net_stream_receive()) cannot send the full buffer.
A quick hack (with stream.c) that retries sending the buffer instead of returning 0 seems to fix the problem:
diff --git a/net/stream.c b/net/stream.c
index a84295f209..3ab2966bb9 100644
--- a/net/stream.c
+++ b/net/stream.c
@@ -85,7 +85,7 @@ static ssize_t net_stream_receive(NetClientState *nc, const uint8_t *buf,
     size_t remaining;
     ssize_t ret;
-
+again:
     remaining = iov_size(iov, 2) - s->send_index;
     nlocal_iov = iov_copy(local_iov, 2, iov, 2, s->send_index, remaining);
     ret = qio_channel_writev(s->ioc, local_iov, nlocal_iov, NULL);
@@ -98,6 +98,7 @@ static ssize_t net_stream_receive(NetClientState *nc, const uint8_t *buf,
     }
     if (ret < (ssize_t)remaining) {
         s->send_index += ret;
+        goto again;
         s->ioc_write_tag = qio_channel_add_watch(s->ioc, G_IO_OUT,
                                                  net_stream_writable, s, NULL);
         return 0;
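The bookkeeping behind this hack (advance `send_index` after a short write, then resend only the remainder of the two-element iovec) can be sketched in standalone C. This is an illustration under stated assumptions, not the QEMU implementation: `iov_skip()`, `send_all()`, the `writev_fn` callback type and the `chunk_sink` test writer are all hypothetical, and `iov_skip()` only approximates what `iov_copy()` is used for in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: describe src[] minus its first `offset` bytes in dst[].
 * Returns the number of entries written to dst[]. */
static int iov_skip(struct iovec *dst, const struct iovec *src, int cnt,
                    size_t offset)
{
    int n = 0;

    for (int i = 0; i < cnt; i++) {
        if (offset >= src[i].iov_len) {
            offset -= src[i].iov_len;   /* this entry was fully sent already */
            continue;
        }
        dst[n].iov_base = (char *)src[i].iov_base + offset;
        dst[n].iov_len = src[i].iov_len - offset;
        offset = 0;
        n++;
    }
    return n;
}

/* Hypothetical writer callback: returns bytes accepted, or a negative error. */
typedef ssize_t (*writev_fn)(void *opaque, const struct iovec *iov, int cnt);

/* Retry after every short write until all bytes are accepted.
 * Returns the number of bytes sent, or a negative error from the writer. */
ssize_t send_all(writev_fn wr, void *opaque, const struct iovec *iov, int cnt)
{
    struct iovec local[2];
    size_t total = 0, send_index = 0;

    assert(cnt >= 0 && cnt <= 2);
    for (int i = 0; i < cnt; i++) {
        total += iov[i].iov_len;
    }
    while (send_index < total) {
        int n = iov_skip(local, iov, cnt, send_index);
        ssize_t ret = wr(opaque, local, n);

        if (ret < 0) {
            return ret;
        }
        if (ret == 0) {
            break;              /* writer accepts nothing: stop instead of spinning */
        }
        send_index += (size_t)ret;
    }
    return (ssize_t)send_index;
}

/* Stand-in for a socket that accepts at most max_per_call bytes per call. */
struct chunk_sink {
    char buf[64];
    size_t len;
    size_t max_per_call;
    int calls;
};

ssize_t chunk_write(void *opaque, const struct iovec *iov, int cnt)
{
    struct chunk_sink *s = opaque;
    size_t room = s->max_per_call, done = 0;

    s->calls++;
    for (int i = 0; i < cnt && room > 0; i++) {
        size_t n = iov[i].iov_len < room ? iov[i].iov_len : room;

        if (s->len + n > sizeof(s->buf)) {
            return -1;
        }
        memcpy(s->buf + s->len, iov[i].iov_base, n);
        s->len += n;
        done += n;
        room -= n;
    }
    return (ssize_t)done;
}
```

Passing a 4-byte and a 5-byte buffer through `send_all()` with a `chunk_sink` limited to 3 bytes per call takes three writer calls and delivers all 9 bytes in order. A loop like this keeps the caller busy until the peer drains its buffer, which is why it is suitable as a diagnostic hack rather than a fix.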
I'm investigating why net_socket_writable() (net_stream_writable()) doesn't send the remaining data of the buffer.
Setting the vectors parameter of the virtio-net device to 1 seems to be another workaround:

  ... -device virtio-net-pci,vectors=1,...

(In reply to Laurent Vivier from comment #5)
> Setting vectors parameter of virtio-net device to 1 seems to be another
> workaround:
>
> ... -device virtio-net-pci,vectors=1,...

I enthusiastically tried that to replace the current workaround in passt's testsuite, but I hit the issue (just once) even with that. It doesn't happen as frequently, though; it might be a different issue, and I didn't have the means to collect more information about it on the spot.

Wenli

Are you able to reproduce in QE's lab?

(In reply to Chao Yang from comment #7)
> Wenli
>
> Are you able to reproduce in QE's lab?

I tried 10 times with the 9.2 downstream version qemu-kvm-7.1.0-1.el9.x86_64/5.14.0-70.13.1.el9_0.x86_64 (host) and could not reproduce the issue.

This is now fixed in upstream qemu by:
commit df8d07081718c29d04d106583d9c300128686cda
Author: Laurent Vivier <lvivier>
Date: Thu Oct 20 11:58:45 2022 +0200
virtio-net: fix bottom-half packet TX on asynchronous completion
and it will make its way into RHEL 9.2 together with the planned rebase to qemu 7.2: https://bugzilla.redhat.com/show_bug.cgi?id=2135806. Setting this ticket as blocked by that one.
Updating the rest of the fields necessary for inclusion by rebase.

QE bot (pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Tested with qemu-kvm-7.2.0-1.el9 10 times with the steps from comment #0; iperf3 works well and performance looks good, so setting this to VERIFIED.

[root@dell-per440-18 ~]# iperf3 -c ${gw} -Z -P 5 -l 1M -i1 -t30 -O5 --pacing-timer 100000
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  13.8 GBytes  3.94 Gbits/sec    0   sender
[  5]   0.00-30.01  sec  13.8 GBytes  3.94 Gbits/sec        receiver
[  7]   0.00-30.00  sec  13.7 GBytes  3.93 Gbits/sec    0   sender
[  7]   0.00-30.01  sec  13.7 GBytes  3.92 Gbits/sec        receiver
[  9]   0.00-30.00  sec  13.7 GBytes  3.91 Gbits/sec    0   sender
[  9]   0.00-30.01  sec  13.7 GBytes  3.91 Gbits/sec        receiver
[ 11]   0.00-30.00  sec  13.6 GBytes  3.90 Gbits/sec    0   sender
[ 11]   0.00-30.01  sec  13.6 GBytes  3.90 Gbits/sec        receiver
[ 13]   0.00-30.00  sec  13.3 GBytes  3.81 Gbits/sec    0   sender
[ 13]   0.00-30.01  sec  13.3 GBytes  3.81 Gbits/sec        receiver
[SUM]   0.00-30.00  sec  68.0 GBytes  19.5 Gbits/sec    0   sender
[SUM]   0.00-30.01  sec  68.1 GBytes  19.5 Gbits/sec        receiver

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162
|