Bug 1793327
Summary: | "qemu-kvm: Failed to read from slave." shows when boot qemu vhost-user 2 queues over dpdk 19.11[rhel8.2-av] | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Pei Zhang <pezhang> | |
Component: | qemu-kvm | Assignee: | Adrián Moreno <amorenoz> | |
qemu-kvm sub component: | General | QA Contact: | Pei Zhang <pezhang> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | medium | |||
Priority: | medium | CC: | amorenoz, chayang, jinzhao, juzhang, mrezanin, virt-maint | |
Version: | 8.2 | |||
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1801081 1801542 (view as bug list) | Environment: | ||
Last Closed: | 2020-11-17 17:46:36 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1801081, 1801542 |
Description (Pei Zhang, 2020-01-21 06:57:18 UTC)
Comment 1 (Adrián Moreno):

Hi Pei,

When you say 2 queues, you mean multiqueue in one vhost device (so "queues=2"), right? The issue has nothing to do with having two vhost devices, right?

I have a feeling this is also one of the issues that triggers BZ 1788415. Anyhow, it's good that we have a separate BZ for it.

My bet is that the issue started occurring when the following commit was introduced in DPDK:

    commit 761d57651c51365354cefb624883fccf62aee67d
    Author: Tiwei Bie <tiwei.bie>
    Date:   Thu Sep 5 19:01:25 2019 +0800

        vhost: fix slave request fd leak

        We need to close the old slave request fd if any first before
        taking the new one.

        Fixes: 275c3f944730 ("vhost: support slave requests channel")
        Cc: stable
        Signed-off-by: Tiwei Bie <tiwei.bie>
        Reviewed-by: Maxime Coquelin <maxime.coquelin>

But that commit is not the issue; the problem is in qemu. When multiqueue is enabled, qemu opens one slave socket per queue pair. When that happens, DPDK closes the first socket, which generates the errors we see. Qemu should only open one slave channel, on the first virtqueue pair (see the sketches after this thread). I've posted a patch upstream that should fix this: https://patchwork.ozlabs.org/patch/1226778/

Comment (Pei Zhang):

(In reply to Adrián Moreno from comment #1)
> When you say 2 queues you mean multiqueue in one vhost device (so
> "queues=2"), right? The issue has nothing to do with having two vhost
> devices right?

Hi Adrian,

That's right: 2 queues means "queues=2"; it's multiqueue in one vhost device.

> I have a feeling this is also one of the issues that triggers BZ 1788415.
> Anyhow, it's good that we have a separate BZ for it.

I can re-test BZ 1788415 with this bug fix and verify whether it fixes that one as well. Thank you.

Best regards,

Pei

Comment:

QEMU has recently been split into sub-components, and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

Comment 8:

Verified with qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.x86_64: no errors show in the vhost-user 2-queue test scenarios.

Testcase: live_migration_nonrt_server_2Q_1G_iommu_ovs (Stream Rate: 1Mpps)

| No | Stream_Rate | Downtime | Totaltime | Ping_Loss | moongen_Loss |
| --- | --- | --- | --- | --- | --- |
| 0 | 1Mpps | 231 | 19434 | 0 | 541974 |
| 1 | 1Mpps | 246 | 18475 | 0 | 556005 |
| 2 | 1Mpps | 251 | 17997 | 0 | 572175 |
| 3 | 1Mpps | 224 | 17850 | 0 | 515903 |
| Max | 1Mpps | 251 | 19434 | 0 | 572175 |
| Min | 1Mpps | 224 | 17850 | 0 | 515903 |
| Mean | 1Mpps | 238 | 18439 | 0 | 546514 |
| Median | 1Mpps | 238 | 18236 | 0 | 548989 |
| Stdev | 0 | 12.61 | 714.98 | 0.0 | 23848.21 |

Testcase: live_migration_nonrt_server_2Q_1G_iommu_cross_numa_pvp

| No | Stream_Rate | Downtime | Totaltime | Ping_Loss | moongen_Loss |
| --- | --- | --- | --- | --- | --- |
| 0 | 1Mpps | 233 | 18876 | 0 | 529292 |
| 1 | 1Mpps | 218 | 18484 | 0 | 503991 |
| 2 | 1Mpps | 259 | 19135 | 0 | 584834 |
| 3 | 1Mpps | 197 | 18441 | 0 | 455816 |
| Max | 1Mpps | 259 | 19135 | 0 | 584834 |
| Min | 1Mpps | 197 | 18441 | 0 | 455816 |
| Mean | 1Mpps | 226 | 18734 | 0 | 518483 |
| Median | 1Mpps | 225 | 18680 | 0 | 516641 |
| Stdev | 0 | 26.1 | 331.32 | 0.0 | 53716.73 |

Testcase: nfv_acceptance_nonrt_server_2Q_1G_iommu

| Packets_loss | Frame_Size | Run_No | Throughput | Avg_Throughput |
| --- | --- | --- | --- | --- |
| 0 | 64 | 0 | 21.307340 | 21.30734 |

Testcase: pvp_performance_nonrt_server_2Q_iommu

| Packets_loss | Frame_Size | Run_No | Throughput | Avg_Throughput |
| --- | --- | --- | --- | --- |
| 0 | 64 | 0 | 20.833873 | 20.833863 |
| 0 | 64 | 1 | 20.833853 | 20.833863 |

==Testing details info==

Testcase: live_migration_nonrt_server_2Q_1G_iommu_ovs PASS
Testcase: live_migration_nonrt_server_2Q_1G_iommu_cross_numa_pvp PASS
Testcase: nfv_acceptance_nonrt_server_2Q_1G_iommu PASS
Testcase: pvp_performance_nonrt_server_2Q_iommu PASS

So this bug has been fixed well. Will move to 'VERIFIED' once ON_QA.

Comment:

Move to Verified as per Comment 8.
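For context on the DPDK side of Adrián's analysis in comment 1: the quoted commit makes the backend close any previously received slave request fd before storing a new one. Below is a minimal sketch of that handler logic; the names (`slave_req_fd`, `handle_set_slave_req_fd`) are illustrative, not the actual DPDK identifiers.

```c
#include <unistd.h>

/* Illustrative per-device backend state; real DPDK keeps an equivalent
 * field on its device structure. */
static int slave_req_fd = -1;

/* Sketch of the behavior the commit message describes: close the old
 * slave request fd, if any, before taking the new one. Before that fix
 * the old fd leaked; after it, a master that registers a second fd
 * finds its first channel closed by the backend. */
static void handle_set_slave_req_fd(int new_fd)
{
    if (slave_req_fd >= 0)
        close(slave_req_fd);  /* the master's first channel dies here */
    slave_req_fd = new_fd;
}
```

This also explains why single-queue devices are unaffected: they register exactly one fd, so nothing is ever closed.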
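To make the interaction concrete, here is a small self-contained simulation of the failure mode and the shape of the fix Adrián describes (only the first virtqueue pair establishes the slave channel). It stands in for the vhost-user slave channel with `socketpair()`; all names are illustrative, and this is a sketch of the described behavior, not the literal upstream patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

#define NUM_QUEUE_PAIRS 2

/* Backend side: keeps a single slave request fd and, as in the DPDK
 * commit above, closes the previous one when a new one arrives. */
static int backend_fd = -1;

static void backend_set_slave_req_fd(int new_fd)
{
    if (backend_fd >= 0)
        close(backend_fd);
    backend_fd = new_fd;
}

/* Master side: one potential slave channel endpoint per queue pair. */
static int master_fds[NUM_QUEUE_PAIRS] = { -1, -1 };

static void setup_slave_channel(int vq_pair)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        exit(1);
    }
    master_fds[vq_pair] = sv[0];
    backend_set_slave_req_fd(sv[1]);  /* stands in for VHOST_USER_SET_SLAVE_REQ_FD */
}

int main(void)
{
    int fixed = 1;  /* set to 0 to reproduce the buggy behavior */
    char buf[8];
    ssize_t n;

    for (int i = 0; i < NUM_QUEUE_PAIRS; i++) {
        /* The shape of the fix: only the first virtqueue pair sets up
         * the slave channel. Without this guard, registering pair 1
         * makes the backend close pair 0's end of the channel. */
        if (!fixed || i == 0)
            setup_slave_channel(i);
    }

    /* The backend sends a slave request on the fd it last received. */
    if (write(backend_fd, "req", 3) < 0)
        perror("write");

    /* The master listens on the first pair's channel; with the bug, its
     * peer is already closed and read() returns 0 (EOF). */
    n = read(master_fds[0], buf, sizeof buf);
    if (n <= 0)
        printf("Failed to read from slave. (peer was closed)\n");
    else
        printf("got slave request: %.*s\n", (int)n, buf);

    return 0;
}
```

With `fixed = 0` the program hits the EOF branch, matching the "Failed to read from slave." error in this bug's summary; with `fixed = 1` the request arrives on the single slave channel.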
Comment:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update) and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5137