Bug 1689202 - RFE: check limit on number of SEV guests
Summary: RFE: check limit on number of SEV guests
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: libvirt
Version: 8.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Daniel Berrangé
QA Contact: Luyao Huang
URL:
Whiteboard:
Duplicates: 1689195
Depends On:
Blocks: 1689195 1937634
 
Reported: 2019-03-15 12:09 UTC by Daniel Berrangé
Modified: 2022-05-10 13:25 UTC
CC List: 18 users

Fixed In Version: libvirt-8.0.0-0rc1.1.module+el8.6.0+13853+e8cd34b9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-10 13:18:34 UTC
Type: Feature Request
Target Upstream Version: 8.0.0
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2022:1759 0 None None None 2022-05-10 13:20:07 UTC

Description Daniel Berrangé 2019-03-15 12:09:05 UTC
Description of problem:
It has been reported that when using SEV, there is a limit of 15 guests that are able to use the feature concurrently.

https://www.redhat.com/archives/libvir-list/2019-January/msg00652.html

This kind of limit is important for a mgmt application to know about, so that it can place VMs on hosts which have suitable resources available, but it is not yet exposed anywhere.

Libvirt thus wants to report it, and would like to obtain this info from QEMU, which can then presumably ask the kernel for it. This might in turn require a kernel RFE if the info isn't already available in some way.

Version-Release number of selected component (if applicable):
qemu-kvm-3.1.0-11.el8

How reproducible:
Always

Steps to Reproduce:
1. Connect to a QMP console and run the "query-sev-capabilities" command

Actual results:
No information on the guest limit is reported

Expected results:
The guest limit is reported

Additional info:

Comment 1 Daniel Berrangé 2019-03-15 16:05:27 UTC
Turns out the limit can be obtained from CPUID results so doesn't need kernel support:

  "We can query the limit through the CPUID Fn0x8000_001f[ECX]."

https://www.redhat.com/archives/libvir-list/2019-March/msg01086.html
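
For illustration only, a minimal user-space sketch of reading that leaf, assuming GCC/clang and <cpuid.h> (this is not the QEMU implementation):

    /* Read CPUID Fn8000_001F and print ECX, the maximum SEV ASID.
     * Illustrative sketch only. */
    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

        if (!__get_cpuid(0x8000001F, &eax, &ebx, &ecx, &edx)) {
            fprintf(stderr, "CPUID leaf 0x8000001F not available\n");
            return 1;
        }

        printf("maximum SEV ASID (ECX): %u\n", ecx);
        return 0;
    }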

Comment 2 Laszlo Ersek 2019-04-08 14:26:14 UTC
(In reply to Daniel Berrange from comment #1)
> Turns out the limit can be obtained from CPUID results so doesn't need
> kernel support:
>
>   "We can query the limit through the CPUID Fn0x8000_001f[ECX]."
>
> https://www.redhat.com/archives/libvir-list/2019-March/msg01086.html

sev_get_capabilities() already does

    host_cpuid(0x8000001F, 0, NULL, &ebx, NULL, NULL);
    cap->cbitpos = ebx & 0x3f;

for determining the "C-bit location in page table entry" (see
@SevCapability in "qapi/target.json").

Hopefully it's a "straightforward" [*] addition to the above
host_cpuid() function call -- retrieve ECX too, and propagate the
information in a new field of @SevCapability.

[*] famous last words
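
A minimal sketch of that addition, assuming a hypothetical max_sev_asid field (not a final QAPI name):

    /* Sketch: extend the existing host_cpuid() call in
     * sev_get_capabilities() to also retrieve ECX. */
    uint32_t ebx, ecx;

    host_cpuid(0x8000001F, 0, NULL, &ebx, &ecx, NULL);
    cap->cbitpos = ebx & 0x3f;   /* existing: C-bit location */
    cap->max_sev_asid = ecx;     /* hypothetical new field */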

--*--

... According to "devel/qapi-code-gen.txt", section "Struct types":

> On output structures (only mentioned in the 'returns' side of a
> command), [...] [c]hanging from optional to mandatory is safe.

and introducing a new (optional or mandatory) field in an output-only
struct is a special case of the above (it's like an optional field that
was never actually produced before that we're now making mandatory).

--*--

Brijesh: is ECX *guaranteed* to contain the needed information, even
with the earliest shipped version of SEV?

... AMD publication #55766 (rev 3.07) "Secure Encrypted Virtualization
API Version 0.17" <https://developer.amd.com/sev/> states,

> 6.17 ACTIVATE
> 6.17.1 Actions
>
> [...]
>
> If the guest is SEV-ES enabled, then the ASID must be at least 1h and
> at most (MIN_SEV_ASID- 1). If the guest is not SEV-ES enabled, then
> the ASID must be at least MIN_SEV_ASID and at most the maximum SEV
> ASID available. The MIN_SEV_ASID value is discovered by CPUID
> Fn8000_001F[EDX]. The maximum SEV ASID available is discovered by
> CPUID Fn8000_001F[ECX].

Based on this, it looks like we can get the simultaneous max guest
*count* from (ECX-EDX+1) -- however, that only applies to non-SEV-ES
guests.
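
A sketch of that arithmetic (variable names illustrative; e.g. ECX=509 and EDX=2 would give 508 SEV and 1 SEV-ES guests):

    /* Guest-count arithmetic per AMD pub #55766, section 6.17;
     * ecx/edx hold the CPUID Fn8000_001F results. */
    uint32_t max_sev_asid = ecx;    /* maximum SEV ASID, e.g. 509 */
    uint32_t min_sev_asid = edx;    /* MIN_SEV_ASID, e.g. 2 */

    /* SEV-ES guests use ASIDs 1 .. MIN_SEV_ASID-1 */
    uint32_t max_es_guests = min_sev_asid - 1;                  /* 1 */

    /* non-ES SEV guests use ASIDs MIN_SEV_ASID .. max SEV ASID */
    uint32_t max_sev_guests = max_sev_asid - min_sev_asid + 1;  /* 508 */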

I wonder if both the SEV-ES and non-SEV-ES guest counts should be
exposed at once in the new @SevCapability fields. What's the right level
of abstraction here? Let QEMU compute the differences and only give
those differences (i.e. counts) to libvirtd, or let QEMU expose the
actual ASID limits to libvirtd, and let libvirtd calculate the counts?

Based on the other fields in @SevCapability, I'd say we're already
exposing pretty low-level / raw details (such as "cbitpos"), so maybe we
should propagate MIN_SEV_ASID and MAX_SEV_ASID to libvirtd transparently
too.

Comment 3 Brijesh Singh 2019-04-08 15:36:01 UTC
Fn8000_001F[ECX] will contain the maximum number of ASIDs on current and future hardware, so it's safe to use this in the kernel and QEMU to calculate the maximum number of SEV guests. I am thinking that in the SEV-ES patches we will extend the QEMU capabilities to expose the following new fields:

SEV Capabilities {
...
...  
  sev-es = true (bool)
  sev-es-max-guest = n (maximum number of simultaneous SEV-ES guests) [EDX]
  sev-max-guest = n    (maximum number of simultaneous SEV guests) [ECX-sev-es-max-guest+1]
...
...
}

Comment 4 Laszlo Ersek 2019-04-08 16:32:33 UTC
Hi Brijesh,

(In reply to BRIJESH SINGH from comment #3)

> I am thinking in SEV-ES patches we will
> extend the qemu capabilities to expose the following new fields:
> 
> SEV Capabilities {
> ...
> ...  
>   sev-es = true (bool)
>   sev-es-max-guest = n (maximum number of simultaneous SEV-ES guests) [EDX]
>   sev-max-guest = n    (maximum number of simultaneous SEV guests)
> [ECX-sev-es-max-guest+1]
> ...
> ...
> }

thanks for the info. Does that mean you are already tracking this feature (including the "sev-max-guest" field) in some upstream tracker item (or patch set even)?

Because then I could make this RHBZ dependent on that upstream tracker, and the RHBZ scope would be reduced to backporting AMD's upstream patches. Thanks.

Comment 5 Brijesh Singh 2019-04-08 18:18:32 UTC
Hi Laszlo,

I don't have a BZ for this. Some time back we had a discussion about this on the libvirt ML, and I created a TODO task in our internal tracker. If you want, I can submit the sev-max-guest field patch in QEMU. I am also okay if you submit the patch and copy me for an Ack. Let me know whatever works for you. The other fields can be added when we submit the SEV-ES patches.

thanks

Comment 6 Laszlo Ersek 2019-04-09 08:21:18 UTC
Hi Brijesh,

my plate is pretty full, and because this change modifies a QAPI schema, I expect the upstream patch to go up to v2 at the least, with the review round-trip time that's usual for QEMU. If you can squeeze the upstream patch into your workload, so that I only have to backport the upstream commit to RHEL8, I'd prefer that.

Thanks!
Laszlo

Comment 7 Brijesh Singh 2019-04-09 13:22:22 UTC
Hi Laszlo,

I will submit the patch this week (probably tomorrow) and copy you.

thanks

Comment 8 Brijesh Singh 2019-04-11 18:01:36 UTC
v1 posted https://patchwork.kernel.org/patch/10896557/

Comment 10 Laszlo Ersek 2019-05-07 09:41:15 UTC
(In reply to Laszlo Ersek from comment #9)
> v2: http://mid.mail-archive.com/20190411235456.12918-1-brijesh.singh@amd.com

This QEMU patch was straightforward and was quickly approved by multiple reviewers; however, Paolo preferred that libvirtd incorporate the check directly.

I'm moving this BZ to libvirtd then.

Comment 13 Jaroslav Suchanek 2020-10-15 12:55:12 UTC
*** Bug 1689195 has been marked as a duplicate of this bug. ***

Comment 16 Dr. David Alan Gilbert 2021-03-16 10:42:40 UTC
Note that 2nd-gen and later AMD hardware has much higher limits than the 15 on 1st gen, so this is less of an issue.
(And on RHEL we don't support 1st gen for SEV.)
Note, however, that there tend to be BIOS settings controlling the number of SEV ASIDs available, so you can still run into the limits due to configuration.

Comment 17 John Ferlan 2021-09-09 16:05:54 UTC
Bulk update: Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

Comment 18 RHEL Program Management 2021-09-15 08:26:11 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 21 John Ferlan 2022-01-06 14:57:54 UTC
Tyler - sending this your way as something to implement within libvirt. I figure this would be something libvirt should display via the virsh domcapabilities output. I'm sure Daniel can elaborate more.

Comment 22 Peter Krempa 2022-01-06 15:06:04 UTC
Note that this is already implemented in libvirt upstream:

commit 7826148a72c97367fc6aaa76397fe92d32169723
Author: Daniel P. Berrangé <berrange>
Date:   Wed Dec 8 14:53:45 2021 -0500

    qemu: report max number of SEV guests
    
    Different CPU generations have different limits on the number
    of SEV/SEV-ES guests that can be run. Since both limits come
    from the same overall set, there is typically also BIOS config
    to set the tradeoff between SEV and SEV-ES guest limits.
    
    This is important information to expose for a mgmt application
    scheduling guests to hosts.
    
    Reviewed-by: Peter Krempa <pkrempa>
    Signed-off-by: Daniel P. Berrangé <berrange>

Comment 23 John Ferlan 2022-01-06 21:47:14 UTC
Thanks Peter - I'll reassign to Daniel, add this to 8.6.0 w/ devel_ack+ and let Daniel do the libvirt & POST magic. I assume it'll be libvirt-8.0.0

Comment 27 Luyao Huang 2022-01-21 07:10:33 UTC
Verified this bug with libvirt-daemon-8.0.0-1.module+el8.6.0+13888+55157bfb.x86_64:

1. prepare an EPYC-Milan system and enable SEV and SEV-ES

# lscpu
...
BIOS Vendor ID:      AMD
CPU family:          25
Model:               1
...

# dmesg |grep SEV
[ 1896.000535] SEV supported: 508 ASIDs
[ 1896.004122] SEV-ES supported: 1 ASIDs

2. check virsh domcapabilities sev support status:

# virsh domcapabilities
...
    <sev supported='yes'>
      <cbitpos>51</cbitpos>
      <reducedPhysBits>1</reducedPhysBits>
      <maxGuests>508</maxGuests>
      <maxESGuests>1</maxESGuests>
    </sev>
...

3. try to start 2 (> maxESGuests) SEV-ES guests and get an error:

# virsh start vm2
error: Failed to start domain 'vm2'
error: internal error: process exited while connecting to monitor: 2022-01-21T06:56:49.488273Z qemu-kvm: -accel kvm: sev_kvm_init: failed to initialize ret=-16 fw_error=0 ''
2022-01-21T06:56:49.488454Z qemu-kvm: -accel kvm: failed to initialize kvm: Operation not permitted


And test on an EPYC host which does not support SEV-ES:

# virsh domcapabilities
...
    <sev supported='yes'>
      <cbitpos>47</cbitpos>
      <reducedPhysBits>1</reducedPhysBits>
      <maxGuests>15</maxGuests>
      <maxESGuests>0</maxESGuests>
    </sev>
...

And test on a host which does not have SEV enabled:

# virsh domcapabilities
...
    <sev supported='no'/>
...
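
For management applications, a minimal C sketch of retrieving these limits via libvirt's virConnectGetDomainCapabilities() API (a simple substring check stands in for real XML parsing, for brevity):

    /* Sketch: fetch the domain capabilities XML and check for the
     * SEV limits reported since libvirt 8.0.0. A real client would
     * parse the XML properly rather than use strstr(). */
    #include <libvirt/libvirt.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpenReadOnly("qemu:///system");
        if (!conn)
            return 1;

        char *caps = virConnectGetDomainCapabilities(conn, NULL, NULL,
                                                     NULL, "kvm", 0);
        if (caps) {
            if (strstr(caps, "<maxGuests>"))
                printf("host reports SEV guest limits\n");
            free(caps);
        }

        virConnectClose(conn);
        return 0;
    }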

Comment 30 errata-xmlrpc 2022-05-10 13:18:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759

