Bug 2107466 - zerocopy capability can be enabled when set migrate capabilities with multifd and compress/xbzrle together
Summary: zerocopy capability can be enabled when set migrate capabilities with multifd...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Leonardo Bras
QA Contact: Li Xiaohui
URL:
Whiteboard:
Duplicates: 2106265 (view as bug list)
Depends On:
Blocks: 2110203
Reported: 2022-07-15 06:59 UTC by Li Xiaohui
Modified: 2022-12-28 02:53 UTC (History)
14 users (show)

Fixed In Version: qemu-kvm-7.0.0-11.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2110203 (view as bug list)
Environment:
Last Closed: 2022-11-15 09:54:42 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/centos-stream/src qemu-kvm merge_requests 111 | 0 | None | opened | zero-copy-send fixes & improvements | 2022-08-02 21:31:07 UTC
Red Hat Issue Tracker | RHELPLAN-127825 | 0 | None | None | None | 2022-07-15 07:08:50 UTC
Red Hat Product Errata | RHSA-2022:7967 | 0 | None | None | None | 2022-11-15 09:55:16 UTC

Description Li Xiaohui 2022-07-15 06:59:27 UTC
Description of problem:
Zerocopy is not supported when compress is enabled.
However, when the multifd, zerocopy, and compress capabilities are set together in a single command, the command succeeds.
When the capabilities are set separately, zerocopy cannot be enabled while compress is enabled, which is the expected behavior.


Version-Release number of selected component (if applicable):
hosts info: kernel-5.14.0-121.el9.x86_64 & qemu-kvm-7.0.0-8.el9.x86_64
guest info: kernel-5.14.0-125.el9.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Boot a guest
2. Set the multifd, zerocopy, and compress capabilities on together:
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"xbzrle","state":false},{"capability":"auto-converge","state":false},{"capability":"rdma-pin-all","state":false},{"capability":"postcopy-ram","state":false},{"capability":"compress","state":true},{"capability":"pause-before-switchover","state":false},{"capability":"late-block-activate","state":false},{"capability":"multifd","state":true},{"capability":"dirty-bitmaps","state":false},{"capability":"return-path","state":false},{"capability":"zero-copy-send","state":true}]}}
{"return": {}}
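For scripting the reproducer, the command in step 2 can be generated instead of typed by hand. The following is a minimal sketch: `build_set_capabilities` is a hypothetical helper (not part of QEMU or libvirt), and only the three capabilities relevant to this bug are included. It builds the JSON text and leaves sending it over the QMP connection to the caller.

```python
import json

# Hypothetical helper, not part of QEMU or libvirt.
# Capability names are taken from the QMP command in step 2.
def build_set_capabilities(states):
    """Return a migrate-set-capabilities command for a {name: bool} mapping."""
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name, "state": state}
                for name, state in states.items()
            ]
        },
    }

# The three capabilities that matter for this bug, set in one command.
command = build_set_capabilities(
    {"compress": True, "multifd": True, "zero-copy-send": True}
)

# JSON text ready to be written to the QMP monitor.
wire_text = json.dumps(command)
```

Passing a single-entry mapping to the same helper produces the one-capability-per-command form used under "Expected results".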


Actual results:
As shown in the steps, enabling the compress, multifd, and zerocopy capabilities together succeeds.


Expected results:
The zerocopy and compress capabilities cannot be enabled together.

If we enable compress, multifd, and zerocopy separately, enabling zerocopy fails:
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"compress","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"multifd","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"zero-copy-send","state":true}]}}
{"error": {"class": "GenericError", "desc": "Zero copy only available for non-compressed non-TLS multifd migration"}}
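The expected behavior can be modeled as "validate the combined end state, not each change in isolation". The sketch below is a Python model under that assumption, not QEMU's C code. The function names are hypothetical, and only the error text is taken from the transcript above.

```python
ZERO_COPY_ERROR = (
    "Zero copy only available for non-compressed non-TLS multifd migration"
)

def check_capabilities(caps):
    """Reject capability combinations that zero-copy-send cannot work with."""
    if caps.get("zero-copy-send"):
        if not caps.get("multifd") or caps.get("compress"):
            raise ValueError(ZERO_COPY_ERROR)

def set_capabilities(current, requested):
    """Apply all requested changes to a copy, validate it, then return it."""
    merged = dict(current)
    merged.update(requested)
    check_capabilities(merged)  # validates the combined end state
    return merged

def try_set(current, requested):
    """Return (state, error_text); the state is unchanged on failure."""
    try:
        return set_capabilities(current, requested), None
    except ValueError as err:
        return current, str(err)

# One command carrying all three capabilities, as libvirt would send it.
batch_state, batch_error = try_set(
    {}, {"compress": True, "multifd": True, "zero-copy-send": True}
)

# Three separate commands, as in the transcript above.
state, _ = try_set({}, {"compress": True})
state, _ = try_set(state, {"multifd": True})
state, separate_error = try_set(state, {"zero-copy-send": True})
```

In this model both paths end with the same error, and a rejected command leaves the previous state untouched.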


Additional info:

Comment 1 Li Xiaohui 2022-07-15 07:04:30 UTC
As libvirt always sets all migration capabilities together through a single QMP command, "migrate-set-capabilities", we should fix this bug.

Comment 2 Li Xiaohui 2022-07-15 07:23:44 UTC
BTW, should we also prevent the compress capability from being enabled when zerocopy is already enabled? For example:
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"multifd","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"zero-copy-send","state":true}]}}
{"return": {}}
{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"compress","state":true}]}}
{"return": {}}

Comment 3 Leonardo Bras 2022-07-15 20:41:12 UTC
Thanks for reporting Li Xiaohui!

I think I have an idea on why this happens, and just wrote a probable fix.

I tested on a scratch build. Could you please give it a try?
https://kojihub.stream.rdu2.redhat.com/koji/taskinfo?taskID=1295372

Comment 4 Li Xiaohui 2022-07-24 09:13:06 UTC
Thanks, Leonardo, for providing the scratch build in https://bugzilla.redhat.com/show_bug.cgi?id=1968509#c26.


I have tested the build (qemu-kvm-7.0.0-8.el9.leonardo202207190013.x86_64). The issues in the Description and Comment 2 have been fixed, and enabling xbzrle while zerocopy is enabled is also rejected.


Only the following question remains. Do you plan to fix it?
1. When trying to migrate with tls + multifd + zerocopy (note that tls certs have been set up on the source and destination hosts):
a. If tls creds are set on the src and dst hosts first, and multifd and zerocopy are enabled afterwards, an error is returned, which is the expected behavior:
{"execute": "migrate-set-parameters", "arguments": {"tls-creds": "tls0"}, "id": "hdTiagq5"}
...
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [{"capability": "zero-copy-send", "state": true}]}, "id": "wgr8gT3T"}
{"id": "wgr8gT3T", "error": {"class": "GenericError", "desc": "Zero copy only available for non-compressed non-TLS multifd migration"}}
b. But if multifd and zerocopy are enabled first and tls creds are set afterwards, all commands succeed, and the migration then fails once it is started:
{"execute": "query-migrate", "id": "qUqoTVuL"}
{"return": {"status": "failed", "error-desc": "Requested Zero Copy feature is not available: Invalid argument"}, "id": "qUqoTVuL"}

My question: for situation b, should setting tls creds be rejected when zerocopy is enabled? Or should the error be more accurate than the error-desc above, for example stating that zerocopy is enabled and tls migration is not supported with zerocopy?
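The first option in the question (reject tls creds at set time) can be sketched as one shared check that runs from both the capability setter and the parameter setter. This is a Python model with hypothetical names (`MigrationConfig`, `error_of`). It is not QEMU code and only reuses the error text and the "tls0" value from the transcripts above.

```python
ZERO_COPY_ERROR = (
    "Zero copy only available for non-compressed non-TLS multifd migration"
)

class MigrationConfig:
    """Hypothetical stand-in for migration capabilities and parameters."""

    def __init__(self):
        self.capabilities = {}
        self.parameters = {}

    def _validate(self, capabilities, parameters):
        # The same check runs no matter which setter triggered it.
        uses_tls = bool(parameters.get("tls-creds"))
        if capabilities.get("zero-copy-send") and uses_tls:
            raise ValueError(ZERO_COPY_ERROR)

    def set_capabilities(self, changes):
        merged = {**self.capabilities, **changes}
        self._validate(merged, self.parameters)
        self.capabilities = merged

    def set_parameters(self, changes):
        merged = {**self.parameters, **changes}
        self._validate(self.capabilities, merged)
        self.parameters = merged

def error_of(action, changes):
    """Run a setter and return its error text, or None on success."""
    try:
        action(changes)
        return None
    except ValueError as err:
        return str(err)

# Situation a: tls creds first, then multifd and zerocopy.
a = MigrationConfig()
a.set_parameters({"tls-creds": "tls0"})
error_a = error_of(a.set_capabilities, {"multifd": True, "zero-copy-send": True})

# Situation b: multifd and zerocopy first, then tls creds.
b = MigrationConfig()
b.set_capabilities({"multifd": True, "zero-copy-send": True})
error_b = error_of(b.set_parameters, {"tls-creds": "tls0"})
```

With a shared check, situation b fails at the set-parameters step with the same message as situation a, rather than later when the migration starts.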

Comment 5 Leonardo Bras 2022-07-25 18:13:56 UTC
(In reply to Li Xiaohui from comment #4)
> Thanks Leoardo to provide the scratch build in
> https://bugzilla.redhat.com/show_bug.cgi?id=1968509#c26.
> 
> 
> I have tested the build (qemu-kvm-7.0.0-8.el9.leonardo202207190013.x86_64),
> the issues in Description, Comment 2 have been fixed, and we also don't
> support enable xbzrle with zerocopy enabled.

That's great!

> 
> 
> Only following question still exist, do you plan to fix it?
> 1.When try to migrate with tls + multifd + zerocopy (note tls certs has been
> set up on source and destination host)
> a. if set tls creds on src and dst host firstly, then enable multifd,
> zerocopy, will get error prompt, it's the expectation:
> {"execute": "migrate-set-parameters", "arguments": {"tls-creds": "tls0"},
> "id": "hdTiagq5"}
> ...
> {"execute": "migrate-set-capabilities", "arguments": {"capabilities":
> [{"capability": "zero-copy-send", "state": true}]}, "id": "wgr8gT3T"}
> {"id": "wgr8gT3T", "error": {"class": "GenericError", "desc": "Zero copy
> only available for non-compressed non-TLS multifd migration"}}
> b. but if enable multifd, zerocopy firstly, then set tls creds, all will
> succeed, but when start migration, migration will fail like:
> {"execute": "query-migrate", "id": "qUqoTVuL"}
> {"return": {"status": "failed", "error-desc": "Requested Zero Copy feature
> is not available: Invalid argument"}, "id": "qUqoTVuL"}
> 
> My question: for situation b, shall we avoid set tls creds successfully when
> zerocopy is enabled? or give accurate error prompt than above error-desc
> like zerocopy is enabled, but don't support tls migration under zerocopy.

That's odd. 
I specifically added a test for checking zero-copy enabled & tls_creds when setting a parameter (migrate_params_check()), and it should output the same error message.
I will try to do some debugging on that, and see what could be going wrong.

Comment 6 Leonardo Bras 2022-07-26 01:33:29 UTC
(In reply to Leonardo Bras from comment #5)
> > My question: for situation b, shall we avoid set tls creds successfully when
> > zerocopy is enabled? or give accurate error prompt than above error-desc
> > like zerocopy is enabled, but don't support tls migration under zerocopy.
> 
> That's odd. 
> I specifically added a test for checking zero-copy enabled & tls_creds when
> setting a parameter (migrate_params_check()), and it should output the same
> error message.
> I will try to do some debugging on that, and see what could be going wrong.

I think I found the error, and that is something related to the parameter struct initialization.
It looks like it loads TLS data even though it says TLS is not enabled, so the check for "enable zero-copy" -> "enable tls" does not fail, even though it should.

I sent a fix to the mailing list, and I will provide the brew / backport MR as soon as I get some feedback.
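The comment above does not show the code involved, so the following is only an illustration of how such a mismatch could let a check pass. It assumes a parameter record that stores a presence flag next to the value. The names `Params`, `has_tls_creds`, and both helper functions are hypothetical and do not come from the fix itself.

```python
from dataclasses import dataclass

@dataclass
class Params:
    """Hypothetical parameter record with a presence flag next to the value."""
    has_tls_creds: bool = False
    tls_creds: str = ""

def tls_in_use_flag_only(params):
    # Trusts the presence flag; a flag left unset hides data that is there.
    return params.has_tls_creds and params.tls_creds != ""

def tls_in_use_value(params):
    # Looks at the stored value itself.
    return params.tls_creds != ""

def zero_copy_allowed(params, tls_in_use):
    """Zero copy is allowed only when TLS is not in use."""
    return not tls_in_use(params)

# Assumed inconsistent state: TLS data is loaded, but the flag says otherwise.
inconsistent = Params(has_tls_creds=False, tls_creds="tls0")

allowed_by_flag = zero_copy_allowed(inconsistent, tls_in_use_flag_only)
allowed_by_value = zero_copy_allowed(inconsistent, tls_in_use_value)
```

In this model the flag-based check lets the combination through while the value-based check rejects it, which matches the symptom reported for situation b.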

Comment 7 John Ferlan 2022-08-01 18:22:22 UTC
Considering the RHEL 8.7.0 cloned bug 2110203 has been posted downstream, I've added the ITR=9.1.0 here as we'll need to fix this in RHEL9 too.

Comment 9 Yanan Fu 2022-08-16 06:57:05 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 12 Li Xiaohui 2022-08-17 09:48:26 UTC
Verified this bug according to Comment 4 on qemu-kvm-7.0.0-11.el9.x86_64; all issues have been fixed.

Marking this bug as verified per the test results and removing 'SanityOnly' from 'Verified', since we have test steps to reproduce this bug.

Comment 13 Li Xiaohui 2022-08-29 11:29:00 UTC
*** Bug 2106265 has been marked as a duplicate of this bug. ***

Comment 15 errata-xmlrpc 2022-11-15 09:54:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967

