Bug 1699448

Summary: Fail to launch AMD SEV VM with assigned PCI device
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Eduardo Habkost <ehabkost>
Component: qemu-kvm
Assignee: Gary R Hook (AMD) <ghook>
Status: CLOSED ERRATA
QA Contact: Pei Zhang <pezhang>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.1
CC: chayang, ddepaula, jinzhao, juzhang, knoel, virt-maint, zhguo
Target Milestone: rc
Keywords: TestOnly
Target Release: 8.1
Flags: knoel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-11-06 07:14:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Eduardo Habkost 2019-04-12 18:17:35 UTC
This bug was initially created as a copy of Bug #1667249

I am copying this bug because: 
The same fix must be applied in RHEL-AV 8.1.0 to avoid regressions.


Description of problem:

On an AMD SEV enabled host with an SEV enabled guest, attaching an assigned device to the VM results in a failure to start the VM:

qemu-kvm: -device vfio-pci,host=01:00.0,id=hostdev0,bus=pci.2,addr=0x0: sev_ram_block_added: failed to register region (0x7fd96e6bb000+0x20000) error 'Cannot allocate memory'

In this case the assigned device is a simple Intel 82574L NIC:

01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
	Subsystem: Intel Corporation Gigabit CT Desktop Adapter
	Flags: bus master, fast devsel, latency 0, IRQ 89, NUMA node 0
	Memory at fb9c0000 (32-bit, non-prefetchable) [size=128K]
	Memory at fb900000 (32-bit, non-prefetchable) [size=512K]

Note the error indicates the region as (base+size) where a size of 0x20000 is 128K, which matches that of BAR0 for the device.  dmesg on the host also reports:

SVM: SEV: Failure locking 32 pages.
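
The two messages agree with each other: the 0x20000-byte region from the QEMU error is 128 KiB, and at a 4 KiB page size that is exactly the 32 pages the kernel failed to lock. A minimal sketch of the arithmetic, assuming 4 KiB host pages (the x86_64 default; the helper name region_pages is made up for illustration):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB host pages, the x86_64 default


def region_pages(size_bytes, page_size=PAGE_SIZE):
    """Return the number of pages needed to cover size_bytes, rounded up."""
    return -(-size_bytes // page_size)


bar0_size = 0x20000  # size taken from the QEMU error message above

print(bar0_size // 1024, "KiB")          # size of the region in KiB
print(region_pages(bar0_size), "pages")  # page count to compare with dmesg
```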

Further analysis shows that SEV guests use the RAMBlock notifier in QEMU to pin pages for SEV. The kernel side of the call only knows how to pin pages with get_user_pages(), which faults on non-page-backed mappings such as the mmap of an MMIO BAR.

Consulting with AMD, Brijesh is aware of this issue and intends to post patches upstream that make the ram_device flag of the region visible to the SEV code, so that these regions can be skipped in QEMU.
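
The idea behind the proposed fix can be illustrated with a toy model. This is not QEMU code: the Region and SevNotifier classes and every field name are hypothetical, and MemoryError merely stands in for the 'Cannot allocate memory' failure. It only shows how skipping regions flagged as ram_device avoids the pinning attempt that fails for an MMIO BAR.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a memory region seen by the notifier."""
    name: str
    size: int
    ram_device: bool = False  # True for an mmap'ed MMIO BAR of an assigned device


@dataclass
class SevNotifier:
    """Toy model of a notifier that pins every region it is told about."""
    skip_ram_device: bool
    pinned: list = field(default_factory=list)

    def ram_block_added(self, region):
        if region.ram_device:
            if self.skip_ram_device:
                return  # proposed behaviour: never try to pin MMIO-backed regions
            # behaviour in this report: pinning a non-page-backed mapping fails
            raise MemoryError(f"failed to register region {region.name}")
        self.pinned.append(region.name)


fixed = SevNotifier(skip_ram_device=True)
fixed.ram_block_added(Region("guest-ram", 4 << 30))
fixed.ram_block_added(Region("vfio-bar0", 0x20000, ram_device=True))
print(fixed.pinned)
```

With skip_ram_device=False the second call raises, which mirrors the launch failure described above.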

Version-Release number of selected component (if applicable):
kernel-4.18.0-61.el8.x86_64
qemu-kvm-2.12.0-57.module+el8+2683+02b3b955.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add a vfio-pci assigned device to a VM, where both the host and the VM were previously configured with SEV support enabled

Actual results:
VM fails to launch

Expected results:
VM works normally

Additional info:

Comment 4 Eduardo Habkost 2019-04-13 16:35:21 UTC
Upstream commits are:

2ddb89b00f94 memory: Fix the memory region type assignment order       v4.0.0-rc2~11^2~19
cedc0ad539af target/i386: sev: Do not pin the ram device memory region v4.0.0-rc2~11^2~18

Comment 5 Danilo de Paula 2019-06-04 23:23:47 UTC
qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71

Comment 6 Danilo de Paula 2019-06-04 23:28:40 UTC
Moving to ON_QA for testing.

Comment 7 Pei Zhang 2019-07-09 09:34:14 UTC
== Steps:
1. Boot VM with device assignment and SEV
/usr/libexec/qemu-kvm \
-enable-kvm \
-cpu EPYC \
-smp 4 \
-m 4G \
-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \
-machine q35,memory-encryption=sev0 \
-drive if=pflash,format=raw,unit=0,file=/usr/share/edk2/ovmf/sev/OVMF_CODE.secboot.fd,readonly \
-drive if=pflash,format=raw,unit=1,file=/usr/share/edk2/ovmf/sev/OVMF_VARS.fd \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device pcie-root-port,id=root.5,chassis=5 \
-device virtio-scsi-pci,iommu_platform=on,id=scsi0,bus=root.1,addr=0x0 \
-drive file=/home/sev_guest.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scssi0-0-0-0,bootindex=1 \
-netdev tap,id=hostnet0,vhost=off \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=18:66:da:57:dd:03,bus=root.2,iommu_platform=true \
-vnc :0 \
-monitor stdio \
-serial unix:/tmp/console,server,nowait \
-device vfio-pci,host=0000:e3:00.0,bus=root.3 \
-device vfio-pci,host=0000:e3:00.1,bus=root.4


== Reproduced with qemu-kvm-3.1.0-20.module+el8+2904+e658c755.x86_64:

After step 1, QEMU quits with the error:
(qemu) qemu-kvm: -device vfio-pci,host=0000:e3:00.0,bus=root.3: sev_ram_block_added: failed to register region (0x7f7b3c200000+0x1000000) error 'Cannot allocate memory'


So this bug has been reproduced.
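
For comparing failures across devices, the base and size can be pulled out of the error line. A minimal sketch, assuming the message keeps the (0xBASE+0xSIZE) format shown in this report; the helper name parse_region is made up for illustration:

```python
import re

# Matches the "(0x<base>+0x<size>)" part of the sev_ram_block_added error.
REGION_RE = re.compile(
    r"failed to register region \(0x([0-9a-fA-F]+)\+0x([0-9a-fA-F]+)\)"
)


def parse_region(line):
    """Return (base, size) as integers, or None if the line does not match."""
    match = REGION_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


msg = ("qemu-kvm: -device vfio-pci,host=0000:e3:00.0,bus=root.3: "
       "sev_ram_block_added: failed to register region "
       "(0x7f7b3c200000+0x1000000) error 'Cannot allocate memory'")

base, size = parse_region(msg)
print(hex(base), size // (1 << 20), "MiB")
```

Here the failing region is 16 MiB, larger than the 128K region in the original description, so the failure is not tied to one particular region size.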

== Verified with qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf.x86_64:

After step 1, the guest boots successfully.

So this bug has been fixed; moving to 'VERIFIED'.

Comment 9 errata-xmlrpc 2019-11-06 07:14:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723