Bug 2089431
Summary: | [RFE] RFE to allow enabling ZEROCOPY live migration through libvirt | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Nils Koenig <nkoenig> |
Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
libvirt sub component: | Live Migration | QA Contact: | Fangge Jin <fjin> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | | |
Priority: | high | CC: | chhu, dzheng, fjin, jdenemar, lcheng, lmen, virt-maint, xuzhang |
Version: | 9.1 | Keywords: | FutureFeature, Triaged |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | libvirt-8.5.0-1.el9 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-11-15 10:04:39 UTC | Type: | Feature Request |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | 8.5.0 |
Embargoed: | | | |
Bug Depends On: | 1968509 | | |
Bug Blocks: | 2089433, 2092752 | | |
Comment 2
Jiri Denemark
2022-06-24 12:33:22 UTC
Scenario 1 (negative): parallel + zerocopy + native_tls + non-p2p
# virsh migrate uefi qemu+tcp://***/system --live --postcopy --bandwidth 10 --auto-converge --zerocopy --parallel --tls
error: operation failed: job 'migration out' failed: Requested Zero Copy feature is not available: Invalid argument

Scenario 2 (negative): non_parallel + zerocopy
# virsh migrate uefi qemu+tcp://***/system --live --postcopy --bandwidth 10 --auto-converge --zerocopy --parallel --tls [--p2p]
error: operation failed: job 'migration out' failed: Requested Zero Copy feature is not available: Invalid argument

Scenario 3: parallel + zerocopy
# virsh migrate uefi qemu+tcp://dell-per640-09.lab.eng.pek2.redhat.com/system --live --postcopy --bandwidth 10 --auto-converge --zerocopy --parallel [--p2p]

Scenario 4: parallel + zerocopy, abort migration, then migrate again.

(In reply to Fangge Jin from comment #7)
> Scenario 2(negative): non_parallel + zerocopy
> # virsh migrate uefi qemu+tcp://***/system --live --postcopy --bandwidth 10
> --auto-converge --zerocopy --parallel --tls [--p2p]
> error: operation failed: job 'migration out' failed: Requested Zero Copy
> feature is not available: Invalid argument

Correcting a typo here. The command should not include --parallel, and the error is different. It should be:
# virsh migrate uefi qemu+tcp://***/system --live --postcopy --bandwidth 10 --auto-converge --zerocopy --tls [--p2p]
error: Requested operation is not valid: zero-copy is only available for parallel migration

Hi Jirka,
I did more testing today and found one issue with the memlock limit. Could you please help confirm whether this is a bug?

Steps:
1. Start the VM and check prlimit:
# prlimit -p 38921 -l
RESOURCE DESCRIPTION                             SOFT      HARD UNITS
MEMLOCK  max locked-in-memory address space 134217728 134217728 bytes

2. Migrate the VM and check prlimit before the migration completes:
# virsh migrate uefi qemu+tcp://***/system --live --postcopy --bandwidth 10 --auto-converge --zerocopy --p2p --parallel
# prlimit -p 38921 -l
RESOURCE DESCRIPTION                              SOFT       HARD UNITS
MEMLOCK  max locked-in-memory address space 2147483648 2147483648 bytes

3. Kill the source virtqemud before the migration completes. The migration fails, but prlimit is not restored:
# prlimit -p 38921 -l
RESOURCE DESCRIPTION                              SOFT       HARD UNITS
MEMLOCK  max locked-in-memory address space 2147483648 2147483648 bytes

Additional info:
1. If I abort the migration with "virsh domjobabort", prlimit is restored.

(In reply to Fangge Jin from comment #9)
> 3. Kill source virtqemud before migration completes, migration will fail.
> But prlimit is not restored:
> # prlimit -p 38921 -l
> RESOURCE DESCRIPTION SOFT HARD UNITS
> MEMLOCK max locked-in-memory address space 2147483648 2147483648 bytes

Do you just kill the daemon, or do you also start it again? If you only kill it and keep it stopped, there's nothing that could restore the limit. But if you start the daemon again and the limit still stays the same, we have a bug somewhere. The "qemu_migration: Restore original memory locking limit" commit should be handling this, but I might have missed something there...

I just kill the daemon, but systemd will restart the daemon immediately and automatically.

(In reply to Fangge Jin from comment #11)
> I just kill the daemon, but systemd will restart the daemon immediately and
> automatically when the daemon is killed.

OK, thanks for confirming. Could you please file a separate BZ for this issue?

Bug filed for the issue in Comment 9:
Bug 2107424 - "mem lock limit" of qemu process is not restored when kill src virtqemud during zerocopy migration.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Low: libvirt security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8003
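
For anyone re-running the memlock check from the prlimit steps in this report, the comparison can be scripted instead of read off by eye. The following is only a minimal sketch, assuming Linux and Python 3: it reads RLIMIT_MEMLOCK for a process through the standard resource.prlimit() call, which is the same information "prlimit -p PID -l" prints. The helper names (memlock_limit, describe, limit_restored) are hypothetical and not part of libvirt or virsh, and reading the limits of a process owned by another user may need root.

```python
import os
import resource


def memlock_limit(pid):
    """Return the (soft, hard) RLIMIT_MEMLOCK values, in bytes, for a process.

    Called with two arguments, resource.prlimit() only reads the limit
    and does not change it.
    """
    return resource.prlimit(pid, resource.RLIMIT_MEMLOCK)


def describe(value):
    """Format one limit value the way a human would want to read it."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes ({value / 2**20:.0f} MiB)"


def limit_restored(before, after):
    """True if the (soft, hard) pair after the failed migration matches the
    pair recorded before the migration started."""
    return tuple(before) == tuple(after)


if __name__ == "__main__":
    # Hypothetical usage: the report used the QEMU process PID (38921);
    # here the script inspects its own process so it runs anywhere.
    soft, hard = memlock_limit(os.getpid())
    print("soft:", describe(soft))
    print("hard:", describe(hard))
```

Recording memlock_limit() for the QEMU PID before the migration and again after the daemon has been restarted, then passing both pairs to limit_restored(), gives a yes/no answer for step 3. With the values from the report, 134217728 before and 2147483648 after, the check returns False.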