Bug 2024367 - i386/pc: Fix creation of >= 1Tb guests on AMD systems with IOMMU
Summary: i386/pc: Fix creation of >= 1Tb guests on AMD systems with IOMMU
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Igor Mammedov
QA Contact: Yanghang Liu
URL:
Whiteboard:
Depends On: 1982898 1983208
Blocks: 1906150
Reported: 2021-11-17 22:54 UTC by Terry Bowman (AMD)
Modified: 2022-01-06 10:19 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1983208
Environment:
Last Closed: 2022-01-06 08:53:26 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                 ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  RHELPLAN-103095  0        None      None    None     2021-11-17 22:56:08 UTC

Description Terry Bowman (AMD) 2021-11-17 22:54:57 UTC
+++ This bug was initially created as a clone of Bug #1983208 +++

+++ This bug was initially created as a clone of Bug #1982898 +++

Short Description:
This series lets QEMU properly spawn i386 guests with >= 1TB of memory when using VFIO, particularly when running on AMD systems with an IOMMU.

Upstream Patches (RFC):
https://lore.kernel.org/qemu-devel/20210622154905.30858-1-joao.m.martins@oracle.com/

Description of problem:

Since Linux v5.4, VFIO validates the IOVA passed to the DMA_MAP ioctl and
returns -EINVAL when it is not valid. On x86, Intel hosts aren't particularly
affected by this extra validation. But AMD systems with an IOMMU have a hole just
below the 1TB boundary which is *reserved* for HyperTransport I/O addresses,
located at FD_0000_0000h - FF_FFFF_FFFFh. See the IOMMU manual [1], specifically
section '2.1.2 IOMMU Logical Topology', Table 3, for what those addresses mean.
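
To make the reserved window concrete, here is a minimal Python sketch of the range and an overlap test. Only the two addresses come from the text above; the names `HT_HOLE_START`, `HT_HOLE_END`, and `overlaps_ht_hole` are illustrative and are not taken from QEMU or the kernel.

```python
GIB = 1 << 30

# Reserved HyperTransport window on AMD hosts, from the description above.
HT_HOLE_START = 0xFD_0000_0000   # first reserved byte (inclusive)
HT_HOLE_END = 0xFF_FFFF_FFFF     # last reserved byte (inclusive)


def overlaps_ht_hole(iova: int, size: int) -> bool:
    """Return True if [iova, iova + size) touches the reserved window."""
    if size <= 0:
        return False
    last = iova + size - 1
    return iova <= HT_HOLE_END and last >= HT_HOLE_START


if __name__ == "__main__":
    # Where the hole starts and how large it is, in GiB.
    print("hole starts at", HT_HOLE_START // GIB, "GiB")
    print("hole size is", (HT_HOLE_END + 1 - HT_HOLE_START) // GIB, "GiB")
```

Under these definitions the window starts at 1012 GiB and spans 12 GiB, ending at the last byte below 1 TiB.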

VFIO DMA_MAP calls in this IOVA address range fail this check and hence return
-EINVAL, consequently failing the creation of guests bigger than 1010G. Example
of the failure:

qemu-system-x86_64: -device vfio-pci,host=0000:41:10.1,bootindex=-1: VFIO_MAP_DMA: -22
qemu-system-x86_64: -device vfio-pci,host=0000:41:10.1,bootindex=-1: vfio 0000:41:10.1: 
	failed to setup container for group 258: memory listener initialization failed:
		Region pc.ram: vfio_dma_map(0x55ba53e7a9d0, 0x100000000, 0xff30000000, 0x7ed243e00000) = -22 (Invalid argument)
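
The failing call can be decoded with a short Python sketch. It assumes the arguments printed in the message are, in order, the container pointer, the IOVA, the mapping size, and the host virtual address; that ordering is an assumption about the log format, and the helper name `decode` is illustrative.

```python
import re

GIB = 1 << 30
HT_HOLE_START = 0xFD_0000_0000  # first reserved byte (from the description)
HT_HOLE_END = 0xFF_FFFF_FFFF    # last reserved byte

LOG_LINE = ("Region pc.ram: vfio_dma_map(0x55ba53e7a9d0, 0x100000000, "
            "0xff30000000, 0x7ed243e00000) = -22 (Invalid argument)")

# Assumed argument order: container pointer, IOVA, size, host virtual address.
PATTERN = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), "
    r"(0x[0-9a-f]+), (0x[0-9a-f]+)\) = (-?\d+)"
)


def decode(line):
    """Extract IOVA, size and return code, and test against the HT window."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a vfio_dma_map failure line")
    iova = int(m.group(2), 16)
    size = int(m.group(3), 16)
    last = iova + size - 1
    return {
        "iova": iova,
        "size": size,
        "last": last,
        "ret": int(m.group(5)),
        "crosses_hole": iova <= HT_HOLE_END and last >= HT_HOLE_START,
    }


if __name__ == "__main__":
    info = decode(LOG_LINE)
    print(hex(info["iova"]), hex(info["last"]), info["crosses_hole"])
```

Read this way, the mapping starts at 4 GiB and extends past 1 TiB, so it covers the whole reserved window, and -22 is -EINVAL.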

Prior to v5.4, we could map using these IOVAs *but* that's still not the right thing
to do and could trigger certain IOMMU events (e.g. INVALID_DEVICE_REQUEST), or
spurious guest VF failures from the resultant IOMMU target abort (see Errata 1155[2])
as documented on the links down below.

This series tries to address that by dealing with this AMD-specific 1TB hole,
similarly to how we deal with the 4G hole today in x86 in general.
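
As a rough illustration of the idea, the Python sketch below lays out guest RAM around both holes. It is not the series' implementation: the split-and-continue-at-1-TiB policy, the `ram_regions` helper, and the default of 2 GiB of RAM below the 4G hole are all assumptions made for the example.

```python
GIB = 1 << 30
FOUR_GIB = 4 * GIB
HT_HOLE_START = 0xFD_0000_0000   # first reserved byte (from the description)
HT_HOLE_END_EXCL = 1 << 40       # 1 TiB, first address after the hole


def ram_regions(ram_size, lowmem=2 * GIB):
    """Return (start, size) guest-physical RAM regions.

    lowmem is an assumed amount of RAM placed below the 4G hole.
    Splitting around the 1 TiB hole is a hypothetical policy used
    only for illustration.
    """
    low = min(ram_size, lowmem)
    regions = [(0, low)]
    high = ram_size - low
    if high == 0:
        return regions
    room = HT_HOLE_START - FOUR_GIB      # space between 4G and the HT hole
    first = min(high, room)
    regions.append((FOUR_GIB, first))
    if high > first:
        # Remaining RAM continues after the reserved window.
        regions.append((HT_HOLE_END_EXCL, high - first))
    return regions


if __name__ == "__main__":
    for total in (1010 * GIB, 1024 * GIB):
        print([(hex(s), n // GIB) for s, n in ram_regions(total)])
```

With the assumed 2 GiB below 4G, a 1010 GiB guest is the largest that still ends exactly at the start of the reserved window, which is consistent with the 1010G figure in the description; anything larger needs a third region above 1 TiB under this policy.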


--- Additional comment from John Ferlan on 2021-07-26 19:21:37 UTC ---

Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Take care to resolve the cloned-from bug 1982898 as well.

Comment 1 Jon Grimm 2021-11-18 03:23:25 UTC
I emailed Joao (upstream developer that had submitted RFC patches) to see if still working on the issue. I don't see anything new on qemu-devel.

Comment 2 Dr. David Alan Gilbert 2021-11-18 09:29:57 UTC
(In reply to Jon Grimm from comment #1)
> I emailed Joao (upstream developer that had submitted RFC patches) to see if
> still working on the issue. I don't see anything new on qemu-devel.

Thanks

Comment 3 John Ferlan 2021-11-23 19:52:12 UTC
Igor - since you own the cloned-from bugs 1982898 and 1983208, I'll assign this directly to you.

Comment 5 Igor Mammedov 2022-01-06 08:53:26 UTC
I'm expecting the changes to be too invasive and risky for 8.5 and too late for 8.6,
if they're ever implemented upstream.
So I'm going to close it for 8.5 and 8.6 and leave only the postponed 9.0 one open.

Comment 6 Yanghang Liu 2022-01-06 10:19:06 UTC
Thanks Igor for the info.

