Bug 2009935 - virtio-vsock: Uperf fails to exchange goodbyes with client when many threads of size 1 are used
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kernel
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefano Garzarella
QA Contact: Qinghua Cheng
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-10-02 06:10 UTC by Lukáš Doktor
Modified: 2022-06-29 01:18 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-26 15:11:17 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
uperf profile which is frequently failing (451 bytes, application/xml)
2021-10-02 06:10 UTC, Lukáš Doktor


Links
Red Hat Issue Tracker RHELPLAN-98705 (last updated 2021-10-02 06:13:24 UTC)

Description Lukáš Doktor 2021-10-02 06:10:34 UTC
Created attachment 1828263
uperf profile which is frequently failing

Description of problem:
When reviewing https://github.com/uperf/uperf/pull/47 I tried using multiple threads (64) and a small packet size (1), which resulted in logged errors and missing client-throughput information in the run summary. Note that the exit code is still 0.

Version-Release number of selected component (if applicable):
* RHEL-9.0.0-20210930.4
* uperf f0cbb1f (from https://github.com/uperf/uperf/pull/47 )
* kernel-5.14.0-1.5.1.el9.x86_64
* on host: qemu-kvm-core-6.0.0-13.el9_b.4.x86_64

How reproducible:
* 9/10 - RHEL8->F33
* 10/10 - RHEL9->RHEL9

Steps to Reproduce:
1. Install RHEL9 on the host, fetch a RHEL9 cloud-init image, and inject ssh keys into it
2. virt-install --import --disk '/var/lib/libvirt/images/RHEL-9.0.0-20210930.4-virtlab722.DefaultLibvirt0.qcow2' --memory '20480' --name 'virtlab722.DefaultLibvirt0' --os-variant 'rhel8.0' --vcpus '10' --serial file,path='/var/log/libvirt/RHEL-9.0.0-20210930.4-virtlab722.DefaultLibvirt0.qcow2__serial.log' --vsock cid.address=42 --dry-run --print-xml > '/var/log/libvirt/RHEL-9.0.0-20210930.4-virtlab722.DefaultLibvirt0.qcow2_.xml'
3. virsh create '/var/log/libvirt/RHEL-9.0.0-20210930.4-virtlab722.DefaultLibvirt0.qcow2_.xml'
4. Fetch and deploy uperf from https://github.com/uperf/uperf/pull/47 on host and guest
5. Add the attached vsock-big.xml (a sketch of such a profile follows these steps)
6. On host run: ./uperf -s -v -S vsock -P20000
7. Inside guest run: h=2 ./uperf -v -S vsock -m ../workloads/vsock-big.xml
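
The attached vsock-big.xml is not reproduced in this report. As a rough sketch only, a profile along the lines described above (64 threads, 1-byte packets, 96 s duration, vsock protocol) would follow the shape of uperf's stock profiles; the exact flowops and option names below are assumptions, not the contents of the actual attachment:

<?xml version="1.0"?>
<profile name="vsock-big">
    <group nthreads="64">
        <!-- each thread connects to the server given via the h env var -->
        <transaction iterations="1">
            <flowop type="connect" options="remotehost=$h protocol=vsock"/>
        </transaction>
        <!-- 96 s of 1-byte writes/reads, as described in this report -->
        <transaction duration="96s">
            <flowop type="write" options="size=1"/>
            <flowop type="read" options="size=1"/>
        </transaction>
        <transaction iterations="1">
            <flowop type="disconnect"/>
        </transaction>
    </group>
</profile>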

Actual results:
Hostname            Time       Data   Throughput   Operations      Errors
----------------------------------------------------------------------------------------------------------------------
Error exchanging goodbye's with client 
Error saying goodbye with 2
master            12.28s    11.16MB     7.62Mb/s     11700192        0.00
----------------------------------------------------------------------------------------------------------------------
Difference(%)      0.00%      0.00%        -nan%        0.00%       -nan%


Expected results:
Run Statistics
Hostname            Time       Data   Throughput   Operations      Errors
----------------------------------------------------------------------------------------------------------------------
[2] Success2                 23.59s   902.99KB   313.60Kb/s       925742        0.00
master            12.22s    11.31MB     7.77Mb/s     11859680        0.00
----------------------------------------------------------------------------------------------------------------------
Difference(%)    -93.10%     92.20%       95.96%       92.19%       0.00%


Additional info:
* The reproducibility rates were tested with a 96 s duration, but it fails regularly even with 10 s. With a 1 or 2 s duration it didn't fail for me in 5 iterations.
* Similarly, using a greater packet size (16) resulted in no errors.
* I noticed that the throughput slows down during the execution. Executing it with '-R -a -i 1' uncovered periods with no transactions (example command below); attaching the output. Maybe the goodbye-exchange issue is just a filled buffer waiting to be read out by the other side.
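
For reference, the interval output mentioned in the last point can be collected by appending those flags to the client command from the reproduction steps (assuming the flags compose as usual):

# inside the guest: raw per-interval statistics, 1 s interval
h=2 ./uperf -v -S vsock -m ../workloads/vsock-big.xml -R -a -i 1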

Comment 3 Stefano Garzarella 2021-10-05 09:01:01 UTC
It could be an issue in the socket (AF_VSOCK) shutdown protocol implementation.
I'll investigate.

Moved to kernel/KVM component.

Comment 4 Qinghua Cheng 2021-10-19 08:17:17 UTC
Reproduced on RHEL 9 using the vsock-big.xml workload

Host:
kernel: 5.14.0-5.el9.x86_64
qemu-kvm: qemu-kvm-6.1.0-3.el9.x86_64

Guest: 
kernel: 5.14.0-6.el9.x86_64
qemu-kvm: qemu-kvm-6.1.0-3.el9.x86_64

uperf: https://github.com/uperf/uperf

On host run: ./uperf -s -v -S vsock -P20000
Inside guest run: h=2 ./uperf -v -S vsock -m ../workloads/vsock-big.xml


Run Statistics
Hostname            Time       Data   Throughput   Operations      Errors
-------------------------------------------------------------------------------------------------------------------------------
Error exchanging goodbye's with client 
Error saying goodbye with 2
master            98.33s    40.37MB     3.44Mb/s     42336578        0.00
-------------------------------------------------------------------------------------------------------------------------------
Difference(%)      0.00%      0.00%        -nan%        0.00%       -nan%

Comment 5 Stefano Garzarella 2021-10-21 15:24:52 UTC
Thanks Qinghua for providing an environment in which to replicate this.

After several checks, it seems to be a timeout firing in uperf.

Basically, the vsock-big.xml profile creates many threads with a lot of traffic. After 96 seconds the master starts the disconnection phase, stopping local threads and sending a request to get statistics from the slave.

It waits for 15 seconds (a timeout hard-coded in uperf) and then fails if the slave doesn't respond.
In some cases there are still a lot of packets queued for transmission, so the request for the slave to send statistics is processed late, after the timeout.

Increasing UPERF_GOODBYE_TIMEOUT in the uperf code to 30 seconds helps, but to avoid code changes we can put the following lines just before the "disconnect" in the profile to wait a bit:

<transaction iterations="0">
    <flowop type="think" options="duration=10s idle"/>
</transaction>
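
In context, the wait sits between the traffic transaction and the final disconnect, so the tail of the group would look roughly like this (the surrounding transactions are assumed from the reproduction profile, not taken from the actual attachment):

<!-- idle for 10 s so queued packets drain before the goodbye -->
<transaction iterations="0">
    <flowop type="think" options="duration=10s idle"/>
</transaction>
<transaction iterations="1">
    <flowop type="disconnect"/>
</transaction>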

I'm still not sure whether the general problem is in uperf or whether the high traffic in vsock starves the control socket, so that the goodbye message is delayed too much. I will investigate in the future, maybe comparing with a TCP socket, but I think we can close this BZ and eventually open a new one with a specific test case.

Comment 6 Stefano Garzarella 2021-11-26 15:11:17 UTC
Analyzing the code and doing some tests, I found there is indeed a fairness problem in vsock.
The current implementation uses a single list, so if one socket queues a lot of packets, another socket can observe a long delay.

The solution is not easy, and since we have a workaround, I'm closing this BZ, but I opened an issue [1] in the upstream project to keep track of this problem and solve it when we have time.

[1] https://gitlab.com/vsock/vsock/-/issues/1
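
To make the fairness effect concrete, below is a small standalone sketch (illustrative C only, not the kernel's virtio_transport code; the backlog sizes are made up): with one shared transmit list, a second socket's lone control packet is served only after the first socket's entire backlog, while round-robin over per-socket queues serves it almost immediately.

#include <stdio.h>

int main(void)
{
    /* Made-up backlogs: socket 0 has 100000 data packets queued,
     * socket 1 has a single control ("goodbye") packet. */
    int backlog[2] = { 100000, 1 };

    /* Single shared list (the current behaviour described above):
     * socket 1's packet sits behind all of socket 0's backlog. */
    printf("single list: sock 1 served at position %d\n",
           backlog[0] + backlog[1]);

    /* Round-robin over per-socket queues: socket 1 waits for at
     * most one packet from socket 0. */
    int pending[2] = { backlog[0], backlog[1] };
    int pos = 0;
    while (pending[0] > 0 || pending[1] > 0) {
        for (int s = 0; s < 2; s++) {
            if (pending[s] > 0) {
                pending[s]--;
                pos++;
                if (s == 1 && pending[1] == 0)
                    printf("round robin: sock 1 served at position %d\n",
                           pos);
            }
        }
    }
    return 0;
}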

