This bug has been migrated to another issue tracking site. It has been closed here and may no longer be being monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2084533 - [RFE] qemu should communicate available physical address space to the guest (firmware).
Summary: [RFE] qemu should communicate available physical address space to the guest (...
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: edk2
Version: 9.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Gerd Hoffmann
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On: 2024818
Blocks:
Reported: 2022-05-12 11:45 UTC by Gerd Hoffmann
Modified: 2022-12-14 14:37 UTC (History)
CC List: 21 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2024818
Environment:
Last Closed: 2022-12-13 11:53:55 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker   RHEL-60 0 None None None 2022-12-14 14:37:38 UTC
Red Hat Issue Tracker RHELPLAN-121834 0 None None None 2022-05-12 11:49:30 UTC

Comment 2 Gerd Hoffmann 2022-05-12 11:54:15 UTC
State of affairs:

qemu has a switch (-cpu ${name},host-phys-bits={on,off}).

When enabled (downstream default, also forced for the microvm machine type upstream), the standard cpuid query returns a physical address space size which is guaranteed to work (it matches the host by default and can be smaller, but cannot be larger).

When disabled (upstream default for q35 and pc) standard cpuid query returns whatever is configured (40 by default), even if the host supports less than that.

The problem the firmware has is that it can't trust the standard cpuid query to return a usable value, so the available physical address space is not known.  Today the firmware tries to handle that by being conservative and not exceeding the minimum size (phys-bits=36 aka 64G), which becomes increasingly problematic.
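
To make the sizes concrete: phys-bits=N means 2^N bytes of physical address space. A minimal Python sketch of that arithmetic follows; the function name is illustrative only and does not come from qemu or edk2.

```python
def address_space_gib(phys_bits: int) -> int:
    """Size in GiB of an address space with the given number of physical address bits."""
    # 2**phys_bits bytes, divided by 2**30 bytes per GiB.
    return (1 << phys_bits) >> 30


# The conservative firmware minimum, the upstream qemu default, and a larger host value.
for bits in (36, 40, 46):
    print(f"phys-bits={bits}: {address_space_gib(bits)} GiB")
```

With phys-bits=36 this gives the 64G limit mentioned above; the upstream default of 40 corresponds to 1024 GiB (1 TiB).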

Comment 3 Gerd Hoffmann 2022-05-12 11:57:21 UTC
Two possible solutions I see for that:

(a) create a _reliable_ way for the guest to query the physical address space which is available.
(b) communicate the state of the host-phys-bits switch to the guest somehow, so the guest knows whether the standard cpuid query can be trusted or not.

Comment 5 Gerd Hoffmann 2022-05-17 12:07:58 UTC
> Today the
> firmware tries to handle that by being conservative and not exceed the
> minimum size (phys-bits=36 aka 64G), which becomes increasingly problematic.

For example when pci-assigning a gpu with a 32G memory bar (Cc'ing Guo Zhiyi).
Today the only option we have is to manually configure the firmware and override
the conservative defaults (-fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=65536).
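
As a rough illustration of where the 65536 comes from, here is a small Python sketch that builds the -fw_cfg argument for a given aperture size. It assumes the option value is a size in MB (as the "Mb" suffix in the option name suggests) with 1024 MB per GiB; the helper name is hypothetical.

```python
def pci_mmio64_fw_cfg(aperture_gib: int) -> str:
    """Build the -fw_cfg argument for a 64-bit PCI MMIO aperture of the given size in GiB.

    Assumption: the option value is expressed in MB, 1024 MB per GiB.
    """
    aperture_mb = aperture_gib * 1024
    return f"name=opt/ovmf/X-PciMmio64Mb,string={aperture_mb}"


# A 64 GiB aperture reproduces the value used in the comment above.
print("-fw_cfg", pci_mmio64_fw_cfg(64))
```

Under that assumption, the 65536 in the example above corresponds to a 64 GiB aperture.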

Comment 6 Dr. David Alan Gilbert 2022-05-18 08:27:14 UTC
Would this be fixed if the non-hostphysbits case somehow told you min(configured, actual)?

Comment 7 Gerd Hoffmann 2022-05-18 11:25:07 UTC
(In reply to Dr. David Alan Gilbert from comment #6)
> Would this be fixed if the non-hostphysbits case somehow told you
> min(configured, actual)?

Yes.

We have that for the host-phys-bits=on case too, via host-phys-bits-limit.  You can configure the phys-bits presented to the guest.  It's required to be lower than or equal to the host value.  Mostly interesting for live migration compatibility, so one can configure the lowest value of the host pool.

So providing min(configured, actual) to the firmware for the host-phys-bits=off case would be quite similar, except for qemu not throwing an error in case the configured (or default) phys-bits are larger than actual.
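
The min(configured, actual) idea can be sketched in a few lines of Python. This only illustrates the proposal discussed here, not existing qemu behaviour; the function name and the sample host values are assumptions for the example.

```python
def usable_phys_bits(configured: int, host: int) -> int:
    """Value the firmware could safely rely on in the host-phys-bits=off case.

    Never larger than what the host actually supports, and no error is raised
    when the configured (or default) value exceeds the host value.
    """
    return min(configured, host)


DEFAULT_PHYS_BITS = 40  # upstream default for q35 and pc, per the description above

print(usable_phys_bits(DEFAULT_PHYS_BITS, host=39))  # host supports less than the default
print(usable_phys_bits(DEFAULT_PHYS_BITS, host=46))  # host supports more than the default
```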

Comment 9 Nitesh Narayan Lal 2022-05-27 20:46:27 UTC
Thanks, Julia.
In that case, we have to find someone who could take this up and move it forward; otherwise we will have to keep it in the backlog.
Adding Igor to keep him in the loop.

Comment 10 Nitesh Narayan Lal 2022-05-31 18:40:33 UTC
Marking this as Triaged and keeping it in the backlog for now.
If this is something that we should be prioritizing please help by commenting or increasing the priority.

In the meantime, if anyone is interested please feel free to pick this up.

Comment 12 Gerd Hoffmann 2022-09-07 12:28:22 UTC
jira issue for firmware changes
https://issues.redhat.com/browse/RHELX-60

Comment 15 Nitesh Narayan Lal 2022-09-08 12:58:10 UTC
Hi Gerd, a couple of questions here.
- Would you like to own this BZ? Even if the patches come via rebase after getting picked upstream, we will need an assignee for any QE follow-ups.
- Do we have a BZ tracking the kernel change?
Thanks

Comment 16 Gerd Hoffmann 2022-09-08 13:56:59 UTC
(In reply to Nitesh Narayan Lal from comment #15)
> Hi Gerd, a couple of questions here.
> - Would you like to own this BZ? Even if the patches come via rebase after
> getting picked upstream, we will need an assignee for any QE follow-ups.

Yes, can take it.

> - Do we have a BZ tracking the kernel change?

Not needed, no kernel code changes.  It's really just reserving the bit so
it isn't grabbed for something else and upstream kernel repo is the place
where we have to do that because the master copy of the kvm header files
lives there.  Downstream kernel doesn't matter.

Comment 17 Gerd Hoffmann 2022-09-23 06:27:23 UTC
Ok, seems we have a plan now which doesn't require host (kernel/qemu) changes.

Quoting patch discussions:

> > Intel processors that are not extremely old have host-phys-bits equal
> > to 39, 46 or 52. Older processors that had 36, in all likelihood,
> > didn't have IOMMUs (so no big 64-bit BARs).
> >
> > AMD processors have had 48 for a while, though older consumer processors
> > had 40.
>
> How reliable is the vendorid?

Pretty reliable. In principle it can be changed, but there's no good reason
to do it (especially in a long lived VM) and it requires manual command
line intervention.

> Given newer processors have more than 40 and for older ones we know
> the possible values for the two relevant x86 vendors we could do
> something along the lines of:
>
>    phys-bits >= 41                   -> valid
>    phys-bits == 40    + AuthenticAMD -> valid
>    phys-bits == 36,39 + GenuineIntel -> valid
>    everything else                   -> invalid
>
> Does that look sensible to you?

Yes, it does!
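
The rule quoted above can be written down as a small Python sketch. This is only an illustration of the table from the patch discussion, not the code that went into edk2; the function name is hypothetical, and the vendor argument stands for the cpuid vendor id string.

```python
def phys_bits_trusted(phys_bits: int, vendor: str) -> bool:
    """Decide whether the phys-bits value reported by cpuid looks trustworthy."""
    if phys_bits >= 41:
        # Larger than the upstream qemu default of 40, so not a blind default.
        return True
    if phys_bits == 40 and vendor == "AuthenticAMD":
        # Older consumer AMD processors really had 40.
        return True
    if phys_bits in (36, 39) and vendor == "GenuineIntel":
        # Known values for Intel processors.
        return True
    # Everything else: treat as invalid and stay conservative.
    return False


for bits, vendor in [(46, "GenuineIntel"), (40, "AuthenticAMD"), (40, "GenuineIntel")]:
    print(bits, vendor, phys_bits_trusted(bits, vendor))
```

The case that stays untrusted is phys-bits=40 on an Intel vendor id, which is what the upstream default produces with host-phys-bits=off.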

Comment 18 Gerd Hoffmann 2022-11-22 13:48:43 UTC
No qemu changes needed.
edk2 progress is tracked here: https://issues.redhat.com/browse/RHEL-60
current state: merged upstream, stream9 / rhel-9.2 will get it with the next edk2 rebase.

