Bug 2196289
| Summary: | Fix number of ready channels on multifd | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Juan Quintela <quintela> |
| Component: | qemu-kvm | Assignee: | Leonardo Bras <leobras> |
| qemu-kvm sub component: | Live Migration | QA Contact: | Li Xiaohui <xiaohli> |
| Status: | CLOSED ERRATA | Severity: | medium |
| Priority: | medium | CC: | chayang, jinzhao, juzhang, leobras, nilal, peterx, quintela, virt-maint |
| Version: | 9.3 | Keywords: | Triaged |
| Target Milestone: | rc | Hardware: | Unspecified |
| OS: | Unspecified | Type: | Bug |
| Fixed In Version: | qemu-kvm-8.0.0-5.el9 | Last Closed: | 2023-11-07 08:27:35 UTC |
Description
Juan Quintela
2023-05-08 15:59:37 UTC
There is an up-to-date patchset to fix this issue: https://lists.gnu.org/archive/html/qemu-devel/2023-04/msg04562.html

Upstream commit:

commit d2026ee117147893f8d80f060cede6d872ecbd7f
Author: Juan Quintela <quintela>
Date:   Wed Apr 26 12:20:36 2023 +0200

    multifd: Fix the number of channels ready

Discussed the reproduction steps with Juan over gchat:

1. Set a small migration bandwidth, e.g. 1MB/s;
2. Set very few multifd channels (1-2).

(A sketch of monitor commands for these settings is included at the end of this report.)

Before the fix, the main migration thread is busy-waiting, i.e. its CPU usage is 100%. After the fix, the CPU usage of the main migration thread should be small.

I will test following the above steps before and after the fix. Thank you, Juan.

Hi Leonardo, what is our fix plan for this bug? I see the ITR is set to RHEL 9.3.0. Can you help set a proper DTM?

QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Extending ITM to 20 as the reproduction steps are not clear; I need more time to test and to get confirmation from Juan / Leonardo.

Hi all, I did some tests on qemu-kvm-8.0.0-1.el9.x86_64 and qemu-kvm-8.0.0-7.el9.x86_64.

Test steps:

1. Enable the multifd capability and set multifd channels to 1 on both the src and dst hosts;
2. Set the migration bandwidth to 0 (no limit);
3. Run stressapptest in the VM: # stressapptest -M 10000 -s 1000000
4. Start migrating the VM from the src host to the dst host.

Note: the NIC supports 200G bandwidth.

Before the fix (qemu-kvm-8.0.0-1.el9.x86_64), the main migration thread is busy on the src host: the live_migration thread is at 85.0% CPU and the multifdsend thread at 17.9%; after 1 second, the live_migration thread drops to 18.0% and the multifdsend thread to 4.0%.

After the fix (qemu-kvm-8.0.0-7.el9.x86_64), the CPU usage of the main migration thread is small: the live_migration thread is at 8.3% and the multifdsend thread at 9.7%.

I also tested the above scenario with multifd channels set to 10 on qemu-kvm-8.0.0-7.el9.x86_64; the CPU usage was: live_migration thread 9.3%; four multifdsend threads at 2.3%, three at 2.0%, and three at 1.7%. (One way to sample these per-thread numbers is sketched at the end of this report.)

Per the above test results, I think we can mark this bug verified. Juan, Leonardo, what do you think?

It looks correct. Thanks very much.

Thanks for the review. Marking the bug verified per Comment 11 and Comment 12. I will add a case later to monitor the CPU usage of the live_migration thread and the multifdsend threads.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368
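
For reference, a minimal sketch of applying the settings from the reproduction steps above, assuming an HMP monitor on the source QEMU. The destination host/port and the 1M bandwidth value are illustrative placeholders (QE's own test used a bandwidth of 0, i.e. no limit), and the destination side needs the same multifd capability and channel settings:

```
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels 1
(qemu) migrate_set_parameter max-bandwidth 1M
(qemu) migrate -d tcp:dst-host:4444
```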
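
And one way the per-thread CPU figures quoted above could be sampled on the source host (a sketch; `pgrep -o qemu-kvm` picks the oldest matching process and is an assumption, not something recorded in this report):

```
# Threads show up by name: "live_migration" and "multifdsend_<n>".
top -H -p "$(pgrep -o qemu-kvm)"
# Or, with sysstat installed, print per-thread CPU usage every second:
pidstat -t -p "$(pgrep -o qemu-kvm)" 1
```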