Bug 2107466
Summary: | zerocopy capability can be enabled when set migrate capabilities with multifd and compress/xbzrle together | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Li Xiaohui <xiaohli> | |
Component: | qemu-kvm | Assignee: | Leonardo Bras <leobras> | |
qemu-kvm sub component: | Live Migration | QA Contact: | Li Xiaohui <xiaohli> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | medium | |||
Priority: | medium | CC: | chayang, chdong, coli, dgilbert, fjin, jinzhao, juzhang, lcheng, leobras, lijin, mdean, peterx, quintela, virt-maint | |
Version: | 9.1 | Keywords: | Triaged | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-7.0.0-11.el9 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2110203 | Environment: | |
Last Closed: | 2022-11-15 09:54:42 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 2110203 |
Description
Li Xiaohui
2022-07-15 06:59:27 UTC
As libvirt always sets all migrate capabilities together through one QMP command, "migrate-set-capabilities", we should fix this bug.

BTW, shall we also prevent the compress capability from being enabled successfully when zerocopy is already enabled? For example:

{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"multifd","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"zero-copy-send","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"compress","state":true}]}}
{"return": {}}

Thanks for reporting, Li Xiaohui!
I think I have an idea of why this happens, and I just wrote a probable fix. I tested it on a scratch build. Could you please give it a try?
https://kojihub.stream.rdu2.redhat.com/koji/taskinfo?taskID=1295372

Thanks, Leonardo, for providing the scratch build in https://bugzilla.redhat.com/show_bug.cgi?id=1968509#c26.

I have tested the build (qemu-kvm-7.0.0-8.el9.leonardo202207190013.x86_64). The issues in the Description and Comment 2 have been fixed, and enabling xbzrle while zerocopy is enabled is also no longer supported.

Only the following question remains. Do you plan to fix it?
1. When trying to migrate with tls + multifd + zerocopy (note that TLS certs have been set up on the source and destination hosts):
a. If tls creds are set on the src and dst hosts first, and multifd and zerocopy are enabled afterwards, we get an error prompt, which is expected:
{"execute": "migrate-set-parameters", "arguments": {"tls-creds": "tls0"}, "id": "hdTiagq5"}
...
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "zero-copy-send", "state": true}]}, "id": "wgr8gT3T"}
{"id": "wgr8gT3T", "error": {"class": "GenericError", "desc": "Zero copy only available for non-compressed non-TLS multifd migration"}}
b. But if multifd and zerocopy are enabled first and tls creds are set afterwards, everything succeeds, but the migration fails once it is started:
{"execute": "query-migrate", "id": "qUqoTVuL"}
{"return": {"status": "failed", "error-desc": "Requested Zero Copy feature is not available: Invalid argument"}, "id": "qUqoTVuL"}

My question: for situation b, shall we prevent tls creds from being set successfully when zerocopy is enabled? Or give a more accurate error prompt than the error-desc above, e.g. one saying that zerocopy is enabled but TLS migration is not supported with zerocopy.

(In reply to Li Xiaohui from comment #4)
> Thanks, Leonardo, for providing the scratch build in
> https://bugzilla.redhat.com/show_bug.cgi?id=1968509#c26.
>
> I have tested the build (qemu-kvm-7.0.0-8.el9.leonardo202207190013.x86_64).
> The issues in the Description and Comment 2 have been fixed, and enabling
> xbzrle while zerocopy is enabled is also no longer supported.

That's great!

> Only the following question remains. Do you plan to fix it?
> 1. When trying to migrate with tls + multifd + zerocopy (note that TLS certs
> have been set up on the source and destination hosts):
> a. If tls creds are set on the src and dst hosts first, and multifd and
> zerocopy are enabled afterwards, we get an error prompt, which is expected:
> {"execute": "migrate-set-parameters", "arguments": {"tls-creds": "tls0"}, "id": "hdTiagq5"}
> ...
> {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "zero-copy-send", "state": true}]}, "id": "wgr8gT3T"}
> {"id": "wgr8gT3T", "error": {"class": "GenericError", "desc": "Zero copy only available for non-compressed non-TLS multifd migration"}}
> b. But if multifd and zerocopy are enabled first and tls creds are set
> afterwards, everything succeeds, but the migration fails once it is started:
> {"execute": "query-migrate", "id": "qUqoTVuL"}
> {"return": {"status": "failed", "error-desc": "Requested Zero Copy feature is not available: Invalid argument"}, "id": "qUqoTVuL"}
>
> My question: for situation b, shall we prevent tls creds from being set
> successfully when zerocopy is enabled? Or give a more accurate error prompt
> than the error-desc above, e.g. one saying that zerocopy is enabled but TLS
> migration is not supported with zerocopy.

That's odd.
I specifically added a test for checking zero-copy enabled & tls_creds when setting a parameter (migrate_params_check()), and it should output the same error message.
I will try to do some debugging on that, and see what could be going wrong.

(In reply to Leonardo Bras from comment #5)
> > My question: for situation b, shall we prevent tls creds from being set
> > successfully when zerocopy is enabled? Or give a more accurate error prompt
> > than the error-desc above, e.g. one saying that zerocopy is enabled but TLS
> > migration is not supported with zerocopy.
>
> That's odd.
> I specifically added a test for checking zero-copy enabled & tls_creds when
> setting a parameter (migrate_params_check()), and it should output the same
> error message.
> I will try to do some debugging on that, and see what could be going wrong.

I think I found the error, and it is related to the parameter struct initialization.
It looks like it loads TLS data even though it says TLS is not enabled, so the check for the "enable zero-copy" -> "enable tls" order does not fail, even though it should.

I sent a fix to the mailing list, and I will provide the brew / backport MR as soon as I get some feedback.

Considering that the RHEL 8.7.0 cloned bug 2110203 has been posted downstream, I've added ITR=9.1.0 here, as we'll need to fix this in RHEL9 too.

QE bot (pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.
Verified this bug according to Comment 4 on qemu-kvm-7.0.0-11.el9.x86_64; all issues have been fixed.

Marking this bug verified per the test results and removing 'SanityOnly' from 'Verified', since we have test steps to reproduce this bug.

*** Bug 2106265 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967
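For anyone re-running the reproduction steps, the single-command case from the summary (multifd, compress and zerocopy set together, the way libvirt is described as doing) can be built as below. This is a minimal sketch: the helper name `migrate_set_capabilities` is hypothetical, and it only constructs the JSON text. Sending it over a QMP socket is left out.

```python
import json


def migrate_set_capabilities(caps):
    """Build one QMP 'migrate-set-capabilities' command covering several capabilities.

    `caps` maps capability names (as used in this bug, e.g. "multifd",
    "compress", "zero-copy-send") to the desired boolean state.
    """
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name, "state": state}
                for name, state in caps.items()
            ]
        },
    })


# All three capabilities in one command, matching the scenario in the summary.
command = migrate_set_capabilities(
    {"multifd": True, "compress": True, "zero-copy-send": True}
)
print(command)
```

On a fixed build, the expectation based on the comments above is that this combined command is rejected instead of returning {"return": {}}.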