Bug 2076304
Summary: | VFIO refresh to v5.18 | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Alex Williamson <alex.williamson> |
Component: | kernel | Assignee: | Alex Williamson <alex.williamson> |
kernel sub component: | KVM | QA Contact: | Yanghang Liu <yanghliu> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | unspecified | | |
Priority: | medium | CC: | coli, jinzhao, juzhang, nilal, virt-maint, yanghliu, zhguo |
Version: | 9.1 | Keywords: | Triaged |
Target Milestone: | rc | | |
Target Release: | 9.1 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | kernel-5.14.0-96.el9 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2022-11-15 11:02:09 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 2076676, 2077294 | | |
Description
Alex Williamson
2022-04-18 16:13:33 UTC
Hi Alex,

May I ask if you have any additional suggestions for testing this bug?

Is regression testing enough to verify this bug?

(In reply to Yanghang Liu from comment #3)
> Hi Alex,
>
> May I ask if you have any additional suggestion for testing this bug ?
>
> Is doing regression test enough for verifying this bug ?

Yes, regression testing should be used. Ideally we'd be able to test vGPU support, particularly SR-IOV backed vGPU, but we'll likely need a new driver drop from NVIDIA for that, as there are some differences here that break the NVIDIA GRID driver build. I can provide a hacked driver for that, but obviously we need to wait for NVIDIA for an official build. NIC support (PF and VF), NVMe, direct GPU assignment, etc. should all work as before.

Thanks

Pre-verify Test

Test Env:
host:
5.14.0-78.mr701_220418_1703.el9.x86_64
qemu-kvm-7.0.0-1.el9.x86_64
guest:
5.14.0-78.mr701_220418_1703.el9.x86_64
Win11/Win2022

Test Case:
RHEL7-11384 [SR-IOV] Start a vm with a VF -- PASS
RHEL7-11388 [SR-IOV] Shutdown/Reboot a vm with a VF -- PASS
RHEL7-11396 [SR-IOV] Hot-unplug a VF from a vm -- PASS
RHEL7-11399 [SR-IOV] Hot-plug a VF into a vm -- existing bug
RHEL7-11408 [SR-IOV] Start two virtual machines, both of which have multiple VF(s) -- PASS
RHEL7-11409 [SR-IOV] Start a vm with multiple VF(s) -- PASS
RHEL7-11410 [SR-IOV] Start a vm with a PF and multiple VF(s) -- PASS
RHEL7-11373 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Shutdown a vm with multiple VF(s) -- PASS
RHEL7-11374 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Reboot a vm with multiple VF(s) -- PASS
RHEL7-11375 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Hot plug multiple VF(s) into a vm -- existing bug
RHEL7-11376 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Hot unplug multiple VF(s) from a vm -- PASS
RHEL7-11377 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Start a vm with multiple VF(s) -- PASS
RHEL7-11407 [SR-IOV] Start a vm with multifunction=on VF(s) -- PASS
RHEL7-11395 [SR-IOV] Hot-unplug multiple VF from a vm -- PASS
RHEL7-11397 [SR-IOV] Hot-plug multiple VF(s) into a vm -- existing bug
RHEL7-11428 [vfio-pf] PF sanity test -- PASS
RHEL7-11430 [vfio-pf] Shutdown a vm with a PF -- PASS
RHEL7-11431 [vfio-pf] Reboot a VF with a PF -- PASS
RHEL7-11434 [vfio] Hot unplug a PF from a vm -- PASS
RHEL7-11435 [vfio] Hot plug a PF into a vm -- existing bug
RHEL7-11413 [vfio] Start a vm with a PF which has a specified pci address -- PASS
RHEL7-11427 [vfio] Release the PF from vm and rebind PF back to host -- PASS
RHEL7-11439 [vfio] Start a vm with multiple PFs -- PASS
RHEL7-11440 [vfio] Start a vm with virtual network interface(s) and a PF -- PASS
RHEL7-11441 [vfio] Hot unplug multiple PFs from a vm -- PASS
RHEL7-11442 [vfio] Start a vm with multifunction=on PFs -- PASS
RHEL7-11444 [vfio] [2M/1G(x86) 16M(ppc) hugepage] basic test -- PASS
RHEL7-110508 [vfio] Hot unplug and hot plug a PF after a vhost=on virtio nic is unplugged -- existing bug

The hot-plug PF/VF issue is tracked by the following two bugs:
Bug 2055123 - [Q35] Failed to hot-plug a device whose membar > 2M into the vm
Bug 2024818 - [Windows_vm][Q35+ OVMF] Some hot-plugged PF/VF can not find enough free resources that it can use

Hi Zhiyi,

Could you please help update the test result for your part so that we can pre-verify this bug?

(In reply to Yanghang Liu from comment #11)
> Hi Zhiyi,
>
> Could you please help update the test result for your part so that we can
> pre-verify this bug ?

No vfio related issues found in my regression test.
Env used:
Test Env:
host:
5.14.0-78.mr701_220418_1703.el9.x86_64
qemu-kvm-7.0.0-1.el9.x86_64
guest:
kernel-5.14.0-92.el9.x86_64
Win10

Devices tested for GPU passthrough:
1x Nvidia A100
2x Nvidia 16

NVIDIA Driver used: 512.59/GRID 14.1 RC(512.78)

Devices tested for vGPU:
1x GRID A100-40C
8x NVIDIA A16-2Q
8x GRID RTX6000-3Q
4x i915-GVTg_V5_4

NVIDIA Driver used: GRID 14.1, host driver 510.75, VM driver 512.78; (patched GRID 14.0 also tested)

Pre-verify this bug based on Comment 10 and Comment 12.

(In reply to Yanghang Liu from comment #10)

The Regression Test Result for verifying this bug: PASS

Test Env:
host:
5.14.0-96.el9.x86_64
qemu-kvm-7.0.0-4.el9.x86_64
guest:
5.14.0-96.el9.x86_64

> Test Case:
> RHEL7-11384 [SR-IOV] Start a vm with a VF -- PASS
> RHEL7-11388 [SR-IOV] Shutdown/Reboot a vm with a VF -- PASS
> RHEL7-11396 [SR-IOV] Hot-unplug a VF from a vm -- PASS
> RHEL7-11399 [SR-IOV] Hot-plug a VF into a vm -- existing bug
> RHEL7-11408 [SR-IOV] Start two virtual machines, both of which have multiple VF(s) -- PASS
> RHEL7-11409 [SR-IOV] Start a vm with multiple VF(s) -- PASS
> RHEL7-11410 [SR-IOV] Start a vm with a PF and multiple VF(s) -- PASS
> RHEL7-11373 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Shutdown a vm with multiple VF(s) -- PASS
> RHEL7-11374 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Reboot a vm with multiple VF(s) -- PASS
> RHEL7-11375 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Hot plug multiple VF(s) into a vm -- existing bug
> RHEL7-11376 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Hot unplug multiple VF(s) from a vm -- PASS
> RHEL7-11377 [SR-IOV][2M/1G(x86) 16M(ppc) hugepage] Start a vm with multiple VF(s) -- PASS
> RHEL7-11407 [SR-IOV] Start a vm with multifunction=on VF(s) -- PASS
> RHEL7-11395 [SR-IOV] Hot-unplug multiple VF from a vm -- PASS
> RHEL7-11397 [SR-IOV] Hot-plug multiple VF(s) into a vm -- existing bug
> RHEL7-11428 [vfio-pf] PF sanity test -- PASS
> RHEL7-11430 [vfio-pf] Shutdown a vm with a PF -- PASS
> RHEL7-11431 [vfio-pf] Reboot a VF with a PF -- PASS
> RHEL7-11434 [vfio-pf] Hot unplug a PF from a vm -- PASS
> RHEL7-11435 [vfio-pf] Hot plug a PF into a vm -- existing bug
> RHEL7-11413 [vfio-pf] Start a vm with a PF which has a specified pci address -- PASS
> RHEL7-11427 [vfio-pf] Release the PF from vm and rebind PF back to host -- PASS
> RHEL7-11439 [vfio-pf] Start a vm with multiple PFs -- PASS
> RHEL7-11440 [vfio-pf] Start a vm with virtual network interface(s) and a PF -- PASS
> RHEL7-11441 [vfio-pf] Hot unplug multiple PFs from a vm -- PASS
> RHEL7-11442 [vfio-pf] Start a vm with multifunction=on PFs -- PASS
> RHEL7-11444 [vfio-pf] [2M/1G(x86) 16M(ppc) hugepage] basic test -- PASS
> RHEL7-110508 [vfio-pf] Hot unplug and hot plug a PF after a vhost=on virtio nic is unplugged -- existing bug
>
> The hot-plug PF/VF issue is tracked by the following two bugs:
> Bug 2055123 - [Q35] Failed to hot-plug a device whose membar > 2M into the vm
> Bug 2024818 - [Windows_vm][Q35+ OVMF] Some hot-plugged PF/VF can not find enough free resources that it can use

(In reply to Guo, Zhiyi from comment #12)
> (In reply to Yanghang Liu from comment #11)
> > Hi Zhiyi,
> >
> > Could you please help update the test result for your part so that we can
> > pre-verify this bug ?
>
> No vfio related issues found in my regression test.
>
> Env used:
> Test Env:
> host:
> 5.14.0-78.mr701_220418_1703.el9.x86_64
> qemu-kvm-7.0.0-1.el9.x86_64
> guest:
> kernel-5.14.0-92.el9.x86_64
> Win10
>
> Devices tested for GPU passthrough:
> 1x Nvidia A100
> 2x Nvidia 16
>
> NVIDIA Driver used: 512.59/GRID 14.1 RC(512.78)
>
> Devices tested for vGPU:
> 1x GRID A100-40C
> 8x NVIDIA A16-2Q
> 8x GRID RTX6000-3Q
> 4x i915-GVTg_V5_4
>
> NVIDIA Driver used: GRID 14.1, host driver 510.75, VM driver 512.78;(patched
> GRID 14.0 also tested)

GPU passthrough and vGPU tests pass; the host kernel used is 5.14.0-96.el9.x86_64.

Move the bug status to VERIFIED based on Comment 17 and Comment 18.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8267