
Bug 1982898

Summary: i386/pc: Fix creation of >= 1Tb guests on AMD systems with IOMMU
Product: Red Hat Enterprise Linux 8 Reporter: Terry Bowman (AMD) <tbowman>
Component: qemu-kvm Assignee: Igor Mammedov <imammedo>
qemu-kvm sub component: CPU Models QA Contact: Yanghang Liu <yanghliu>
Status: CLOSED DEFERRED Docs Contact:
Severity: unspecified    
Priority: unspecified CC: coli, ctatman, darcari, imammedo, jinzhao, jon.grimm, juzhang, mdean, pradeepvineshreddy.kodamati, suravee.suthikulpanit, terry.bowman, virt-maint, wei.huang2
Version: 8.5 Keywords: Triaged
Target Milestone: beta Flags: pm-rhel: mirror+
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1983208 (view as bug list) Environment:
Last Closed: 2022-01-06 08:50:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1935445, 1983208, 2024367    

Description Terry Bowman (AMD) 2021-07-15 23:57:11 UTC
Short Description:
This series lets QEMU properly spawn i386 guests with >= 1TB of memory when using VFIO, particularly when running on AMD systems with an IOMMU.

Upstream Patches (RFC):
https://lore.kernel.org/qemu-devel/20210622154905.30858-1-joao.m.martins@oracle.com/

Description of problem:

Since Linux v5.4, VFIO validates the IOVA passed to the DMA_MAP ioctl and
returns -EINVAL when it is not valid. On x86, Intel hosts aren't particularly
affected by this extra validation. But AMD systems with an IOMMU have a hole just
below the 1TB boundary which is *reserved* for HyperTransport I/O addresses,
located at FD_0000_0000h - FF_FFFF_FFFFh. See the IOMMU manual [1], specifically
section '2.1.2 IOMMU Logical Topology', Table 3, for what those addresses mean.

VFIO DMA_MAP calls in this IOVA address range fail this check and hence return
-EINVAL, consequently failing the creation of guests bigger than 1010G. Example
of the failure:

qemu-system-x86_64: -device vfio-pci,host=0000:41:10.1,bootindex=-1: VFIO_MAP_DMA: -22
qemu-system-x86_64: -device vfio-pci,host=0000:41:10.1,bootindex=-1: vfio 0000:41:10.1: 
	failed to setup container for group 258: memory listener initialization failed:
		Region pc.ram: vfio_dma_map(0x55ba53e7a9d0, 0x100000000, 0xff30000000, 0x7ed243e00000) = -22 (Invalid argument)
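
To see why the mapping in this log is rejected, the following is a minimal sketch that checks whether a DMA mapping intersects the reserved HyperTransport range quoted in the description. The function name overlaps_ht_hole and the constants HT_START and HT_END are made up for illustration; this is not the kernel's validation code, only the interval arithmetic behind it. The IOVA and size are taken from the vfio_dma_map() call in the log above.

```python
# Reserved HyperTransport range on AMD hosts, as given in the description.
HT_START = 0xFD_0000_0000
HT_END = 0xFF_FFFF_FFFF  # inclusive upper bound


def overlaps_ht_hole(iova, size):
    """Return True if [iova, iova + size) intersects the reserved range."""
    if size <= 0:
        return False
    last = iova + size - 1  # last byte covered by the mapping
    return iova <= HT_END and last >= HT_START


# IOVA and size from the failing vfio_dma_map() call in the log.
iova, size = 0x100000000, 0xFF30000000
print(hex(iova + size))              # first address past the mapping
print(overlaps_ht_hole(iova, size))  # whether the mapping crosses the hole
```

The mapping starts at 4G and ends past the 1TB boundary, so it covers the whole reserved range, which is consistent with the -22 (EINVAL) result shown in the log.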

Prior to v5.4, we could map using these IOVAs, *but* that was still not the right thing
to do and could trigger certain IOMMU events (e.g. INVALID_DEVICE_REQUEST) or
spurious guest VF failures from the resultant IOMMU target abort (see Errata 1155 [2]),
as documented in the links below.

This series tries to address that by dealing with this AMD-specific 1TB hole,
similarly to how we deal with the 4G hole today on x86 in general.
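
As a rough illustration of what "dealing with the hole" could look like, the sketch below splits the RAM that lives above 4G into regions that skip the reserved range, by analogy with how RAM is split around the 4G hole. This is an assumption made for illustration only: the helper layout_above_4g and its constants are hypothetical, and the actual series may choose a different layout.

```python
FOUR_G = 0x1_0000_0000     # start of RAM above the 4G hole
HT_START = 0xFD_0000_0000  # first reserved HyperTransport address
ONE_T = 0x100_0000_0000    # first address after the reserved range


def layout_above_4g(size):
    """Place `size` bytes above 4G as (start, length) regions that skip the hole.

    Hypothetical helper: fills [4G, HT_START) first, then continues at 1TB.
    """
    regions = []
    first = min(size, HT_START - FOUR_G)
    if first > 0:
        regions.append((FOUR_G, first))
    rest = size - first
    if rest > 0:
        regions.append((ONE_T, rest))
    return regions


# Size of the pc.ram region above 4G from the failure log.
for start, length in layout_above_4g(0xFF30000000):
    print(hex(start), hex(length))
```

With this layout no region touches FD_0000_0000h - FF_FFFF_FFFFh, so each region could be mapped with a separate DMA_MAP call that passes the IOVA validation.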



Comment 1 John Ferlan 2021-07-26 19:21:50 UTC
Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Take care to resolve the clone, bug 1983208, as well.

Comment 2 Igor Mammedov 2021-08-23 11:26:09 UTC
RHEL8 uses 4.18.0 kernel, is it really affected by this?

Comment 3 WEI HUANG 2021-08-23 19:58:11 UTC
(In reply to Igor Mammedov from comment #2)
> RHEL8 uses 4.18.0 kernel, is it really affected by this?

Looks like the answer is YES. On an older kernel, the system might fail in a more subtle way due to an IOMMU target abort. In comparison, a newer kernel will fail explicitly due to the IOVA validation failure in VFIO.

Comment 4 Yanghang Liu 2021-10-20 09:07:24 UTC
Hi Igor

May I ask if this bug will be fixed on RHEL8.6 ?

Comment 5 Igor Mammedov 2021-10-20 09:13:34 UTC
(In reply to Yanghang Liu from comment #4)
> Hi Igor
> 
> May I ask if this bug will be fixed on RHEL8.6 ?

I haven't seen any follow-up from the AMD side after the last review upstream.
So the answer to your question for now is "probably no".

Comment 6 Igor Mammedov 2022-01-06 08:50:47 UTC
I'm expecting the changes to be too invasive and risky for 8.5 and too late for 8.6,
if it's ever implemented upstream.
So I'm going to close it for 8.5 and 8.6 and leave only the postponed 9.0 one open.