Bug 2109450
Summary: | libvirt doesn't catch mdevs created thru sysfs | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Sylvain Bauza <sbauza> | |
Component: | libvirt | Assignee: | Jonathon Jongsma <jjongsma> | |
libvirt sub component: | General | QA Contact: | zhentang <zhetang> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | bdobreli, bsawyers, camorris, chhu, dyuan, dzheng, egallen, fjin, gveitmic, jdenemar, jsuchane, kchamart, lmen, smooney, virt-maint, xuzhang, yafu, ymankad, zhetang | |
Version: | 9.0 | Keywords: | AutomationTriaged, Regression, Triaged, ZStream | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-8.7.0-1.el9 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2141364 2141365 (view as bug list) | Environment: | ||
Last Closed: | 2023-05-09 07:26:34 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | 8.7.0 | |
Embargoed: | ||||
Bug Depends On: | 2124466 | |||
Bug Blocks: | 1761861, 2109616, 2109621, 2123586, 2141364, 2141365 |
Description
Sylvain Bauza
2022-07-21 09:37:20 UTC
I tried using the libvirt createDevice API by providing a device XML in order to verify whether this regression was due to the use of sysfs. Unfortunately, the behaviour remains the same: http://pastebin.test.redhat.com/1066905

With a minimal definition of an mdev:

<device>
  <parent>pci_0000_04_00_0</parent>
  <capability type='mdev'>
    <type id='nvidia-320'/>
    <uuid>728781db-ce0b-473e-aae0-e7ab2c5ece93</uuid>
  </capability>
</device>

the virsh call fails but the mdev is created:

[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-create test.xml
WARN[0000] binary not found, container dns will not be enabled
error: Failed to create node device from test.xml
error: An error occurred, but the cause is unknown
[root@computesriov-0 heat-admin]# ll /sys/bus/mdev/devices/
total 0
lrwxrwxrwx. 1 root root 0 Jul 22 08:25 728781db-ce0b-473e-aae0-e7ab2c5ece93 -> ../../../devices/pci0000:00/0000:00:02.0/0000:04:00.0/728781db-ce0b-473e-aae0-e7ab2c5ece93

That said, we don't see the mdev through the libvirt API until we restart the nodedev daemon:

[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled
[root@computesriov-0 heat-admin]# systemctl restart tripleo_nova_virtnodedevd.service
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled
mdev_728781db_ce0b_473e_aae0_e7ab2c5ece93_0000_04_00_0

Also, please note that this issue also affects other upstream operators that *don't* use OSP17, so this isn't a deployment problem: https://bugs.launchpad.net/nova/+bug/1981631

I also noted that deleting an mdev through sysfs is automatically seen by libvirt without requiring a nodedev restart:

[root@computesriov-0 heat-admin]# echo 1 > /sys/bus/mdev/devices/728781db-ce0b-473e-aae0-e7ab2c5ece93/remove
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled
[root@computesriov-0 heat-admin]# cat /sys/class/mdev_bus/0000\:04\:00.0/mdev_supported_types/nvidia-320/available_instances
2

It looks like mdevctl correctly sees the created mdev but libvirt does not:

[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-create test.xml
WARN[0000] binary not found, container dns will not be enabled
error: Failed to create node device from test.xml
error: An error occurred, but the cause is unknown
[root@computesriov-0 heat-admin]# cat /sys/class/mdev_bus/0000\:04\:00.0/mdev_supported_types/nvidia-320/available_instances
1
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud mdevctl list
WARN[0000] binary not found, container dns will not be enabled
728781db-ce0b-473e-aae0-e7ab2c5ece93 0000:04:00.0 nvidia-320 manual
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled

Last piece of information: I tried creating an mdev using the exact same XML from the previously removed mdev and it continues to fail (so basically, the issue isn't due to a lack of details in the XML):

## GET THE PREVIOUS XML FROM A MDEV
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-dumpxml mdev_728781db_ce0b_473e_aae0_e7ab2c5ece93_0000_04_00_0
WARN[0000] binary not found, container dns will not be enabled
<device>
  <name>mdev_728781db_ce0b_473e_aae0_e7ab2c5ece93_0000_04_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/728781db-ce0b-473e-aae0-e7ab2c5ece93</path>
  <parent>pci_0000_04_00_0</parent>
  <driver>
    <name>vfio_mdev</name>
  </driver>
  <capability type='mdev'>
    <type id='nvidia-320'/>
    <uuid>728781db-ce0b-473e-aae0-e7ab2c5ece93</uuid>
    <iommuGroup number='109'/>
  </capability>
</device>

## DELETE THE OLD MDEV
[root@computesriov-0 heat-admin]# echo 1 > /sys/bus/mdev/devices/728781db-ce0b-473e-aae0-e7ab2c5ece93/remove
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled

## UPDATE THE XML ##
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud cat test.xml
WARN[0000] binary not found, container dns will not be enabled
<device>
  <name>mdev_728781db_ce0b_473e_aae0_e7ab2c5ece93_0000_04_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/728781db-ce0b-473e-aae0-e7ab2c5ece93</path>
  <parent>pci_0000_04_00_0</parent>
  <driver>
    <name>vfio_mdev</name>
  </driver>
  <capability type='mdev'>
    <type id='nvidia-320'/>
    <uuid>728781db-ce0b-473e-aae0-e7ab2c5ece93</uuid>
    <iommuGroup number='109'/>
  </capability>
</device>
[root@computesriov-0 heat-admin]# cat /sys/class/mdev_bus/0000\:04\:00.0/mdev_supported_types/nvidia-320/available_instances
2
[root@computesriov-0 heat-admin]# ll /sys/bus/mdev/devices/
total 0
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-create test.xml
WARN[0000] binary not found, container dns will not be enabled
error: Failed to create node device from test.xml
error: An error occurred, but the cause is unknown
[root@computesriov-0 heat-admin]# podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
WARN[0000] binary not found, container dns will not be enabled

I did some manual testing today.
Running udevadm monitor on the host, I can see the udev events when I add or remove a mediated device:

KERNEL[259776.790864] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [259776.793732] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[259776.998323] add /devices/virtual/vfio/109 (vfio)
KERNEL[259776.998346] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [259776.999648] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [259777.004647] add /devices/virtual/vfio/109 (vfio)
KERNEL[259908.099801] remove /devices/virtual/vfio/109 (vfio)
UDEV [259908.102185] remove /devices/virtual/vfio/109 (vfio)
KERNEL[259908.312512] unbind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[259908.312533] remove /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [259908.313050] unbind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [259908.313149] remove /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[260459.337234] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.340162] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[260459.590260] add /devices/virtual/vfio/109 (vfio)
KERNEL[260459.590289] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.591622] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.596676] add /devices/virtual/vfio/109 (vfio)
KERNEL[272070.995805] remove /devices/virtual/vfio/109 (vfio)
UDEV [272070.998217] remove /devices/virtual/vfio/109 (vfio)
KERNEL[272071.207298] unbind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[272071.207323] remove /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [272071.207824] unbind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [272071.207922] remove /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)

udevadm can no longer be run in the container, thanks to https://github.com/systemd/systemd/pull/11346/commits/b05a4c950a928b3407f983f4d9a8a2ff8cbc34f0, so I can't actually run it in the Podman container:

[root@computesriov-1 /]# udevadm monitor
Running in chroot, ignoring request.

If I freshly restart the nodedevd container and list the nodedevs as root, then allocate an mdev and list again, it will not be in the list returned by libvirt. But if I then list it as nova, it generally will be, once. So by swapping from the root session to the nova user session, the per-session nature of the cache allows the second request to work without restarting the container:

[heat-admin@computesriov-1 ~]$ sudo podman exec -it nova_virtnodedevd virsh nodedev-list | grep mdev
WARN[0000] binary not found, container dns will not be enabled
[heat-admin@computesriov-1 ~]$ sudo podman exec -it -u nova nova_virtnodedevd virsh nodedev-list | grep mdev
WARN[0000] binary not found, container dns will not be enabled
mdev_710766c7_994c_40d1_be44_1a670bdfece2_0000_04_00_0
[heat-admin@computesriov-1 ~]$ sudo podman exec -it nova_virtnodedevd virsh nodedev-list | grep mdev
WARN[0000] binary not found, container dns will not be enabled

That confirms that libvirt could see the device if it were not for the caching. In the container the device is also clearly visible in /sys, reinforcing the idea that this is in fact an issue with the udev events in some form. Oddly, deleting the mdev does seem to propagate to the vm.
[heat-admin@computesriov-1 ~]$ sudo systemctl restart tripleo_nova_virtnodedevd
[heat-admin@computesriov-1 ~]$ sudo podman exec -it nova_virtnodedevd virsh nodedev-list | grep mdev
WARN[0000] binary not found, container dns will not be enabled
mdev_710766c7_994c_40d1_be44_1a670bdfece2_0000_04_00_0
[heat-admin@computesriov-1 ~]$ echo 1 | sudo tee /sys/bus/mdev/devices/*/remove
1
[heat-admin@computesriov-1 ~]$ sudo podman exec -it nova_virtnodedevd virsh nodedev-list | grep mdev
WARN[0000] binary not found, container dns will not be enabled
[heat-admin@computesriov-1 ~]$

We run the virtnodedevd container with /dev and /run mounted from the host, as well as --net=host, --pid=host and --privileged, so the udev control socket is available in the container and libvirt should have all permissions to interact with it:

srw-------. 1 root root 0 Aug 5 16:46 /run/udev/control

Since the remove works but the add does not, that implies that at least some of the udev messages are being picked up.

This is interesting:

2022-08-08 17:06:07.560+0000: 12351: debug : virNetlinkEventCallback:889 : event not handled.
2022-08-08 17:06:07.668+0000: 12415: error : udevProcessMediatedDevice:1038 : failed to wait for file '/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2/mdev_type' to appear: No such file or directory

That file does in fact exist and can be seen from inside the nodedevd container:

[root@computesriov-1 libvirt]# ls /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2/mdev_type -al
lrwxrwxrwx. 1 root root 0 Aug 8 21:48 /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2/mdev_type -> ../mdev_supported_types/nvidia-222

root 132387 0.0 0.0 1509456 22660 ? Sl 20:59 0:00 /usr/sbin/virtnodedevd --config /etc/libvirt/virtnodedevd.conf

virtnodedevd is running as root inside the container, just in case you were wondering whether there was a permissions issue.
2022-08-08 21:00:23.756+0000: 132387: debug : virConnectClose:1316 : conn=0x7f1d78013690
2022-08-08 21:48:21.631+0000: 132387: debug : virNetlinkEventCallback:875 : dispatching to max 0 clients, called from event watch 6
2022-08-08 21:48:21.631+0000: 132387: debug : virNetlinkEventCallback:889 : event not handled.
2022-08-08 21:48:21.739+0000: 132440: error : udevProcessMediatedDevice:1038 : failed to wait for file '/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2/mdev_type' to appear: No such file or directory
2022-08-08 21:48:21.996+0000: 132387: debug : virNetlinkEventCallback:875 : dispatching to max 0 clients, called from event watch 6
2022-08-08 21:48:21.996+0000: 132387: debug : virNetlinkEventCallback:889 : event not handled.
2022-08-08 21:48:21.996+0000: 132387: debug : virNetlinkEventCallback:875 : dispatching to max 0 clients, called from event watch 6
2022-08-08 21:48:21.996+0000: 132387: debug : virNetlinkEventCallback:889 : event not handled.

I'm not seeing any SELinux denials or similar that would indicate why libvirt can't read this, but it almost looks like a race.

This is where it's currently failing: https://github.com/libvirt/libvirt/blob/3d5245e3ebd1e143ea858c8535474b681bc21a38/src/node_device/node_device_udev.c#L1039-L1044

/* Because of a kernel uevent race, we might get the 'add' event prior to
 * the sysfs tree being ready, so any attempt to access any sysfs attribute
 * would result in ENOENT and us dropping the device, so let's work around
 * it by waiting for the attributes to become available.
 */

That is the comment documenting this wait in the function: https://github.com/libvirt/libvirt/blob/3d5245e3ebd1e143ea858c8535474b681bc21a38/src/node_device/node_device_udev.c#L1031-L1035

So yes, I would say there is a high probability that we are losing that race and that 100 ms is not enough time:

KERNEL[260459.337234] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.340162] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[260459.590260] add /devices/virtual/vfio/109 (vfio)
KERNEL[260459.590289] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.591622] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.596676] add /devices/virtual/vfio/109 (vfio)

The full add takes about 260 ms, with the bind arriving roughly 250 ms after the initial add event.

It looks like libvirt is being optimistic. virFileWaitForExists takes a sleep time in milliseconds and a number of tries as its arguments (https://github.com/libvirt/libvirt/commit/caf26412b691bf6a7cb34b9db837b92a4e6eb689), so

if (virFileWaitForExists(linkpath, 1, 100) < 0) {
    virReportSystemError(errno,
                         _("failed to wait for file '%s' to appear"),
                         linkpath);
    return -1;
}

is waiting at most 100 ms, and it took much longer for the device init to complete.

So in this specific case, the delay is around 250 ms. But I wonder what sort of variation we can expect? I don't really want to propose changing it to another semi-arbitrary value only to find out that this value is also not sufficient...

Sean, any ideas about my last question? It would be nice to get an idea about what sort of ranges you're seeing for the delay time so that I can pick an appropriate value that will work.

I honestly don't know. The short hack would be to make the sleep time 10 ms instead of 1 ms, allowing this to take up to 1 second instead of 100 ms.
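The retry budget under discussion can be illustrated with a small Python sketch. It only loosely mirrors the (sleep in ms, tries) shape of the virFileWaitForExists call quoted above; the names wait_for_exists and worst_case_budget_ms are hypothetical and this is not libvirt's implementation.

```python
import os
import time


def wait_for_exists(path: str, sleep_ms: int, tries: int) -> bool:
    """Poll for `path`, checking up to `tries` times and sleeping
    `sleep_ms` milliseconds between checks. Hypothetical helper."""
    for attempt in range(tries):
        if os.path.exists(path):
            return True
        # No need to sleep after the final failed check.
        if attempt < tries - 1:
            time.sleep(sleep_ms / 1000.0)
    return False


def worst_case_budget_ms(sleep_ms: int, tries: int) -> int:
    """Approximate upper bound on the total wait, as used in the discussion."""
    return sleep_ms * tries


if __name__ == "__main__":
    print(worst_case_budget_ms(1, 100))   # current arguments: about 100 ms
    print(worst_case_budget_ms(10, 100))  # proposed short hack: about 1 s
```

With the observed delay of roughly a quarter of a second, a 100 ms budget would expire before the sysfs attribute appears, while a 1 second budget would leave some headroom.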
The better fix, I think, would be to trigger off the bind udev event instead of, or in addition to, the add event. That, I think, would eliminate the race, but I'm not 100% sure about that.

I don't have access to this environment permanently; it's one of our QE test systems. Just looking at the times from comment 13 we have 208 ms and 259 ms, and 260 ms for comment 16, so it seems to be around a quarter of a second. So changing the sleep time to 10 ms and keeping 100 retries, to extend the total time to 1 second, likely should be enough, but also triggering on the bind udev event would likely be more robust. Those are not mutually exclusive either, so you could do both to harden this more. I don't think there should be any issue with updating the list twice if we process both the add and bind udev events successfully; you just need to ensure the bind event checks that the device is not already present.

NB: Changing the ITR to 9.2.0 as we are getting late in the process for inclusion into 9.1.0 of an issue that, while easily fixed/resolved, perhaps needs some more design discussion (e.g. a hard-coded timeout value of 100, 200 or 500 vs. a customizable value)... All for a problem from another subsystem (udev) which doesn't guarantee instantaneous and/or synchronous creation of the device. In the long run, "where" the fix should be could be up for discussion. Should Nova be the place where the timeout exists, since it's the place that initiated the creation?

Leaving the ZTR at 9.0.0 since that's where the z-stream would need to be eventually. We can move it back to 9.1.0, but we're at the point of needing an exception... Since RHOS won't be consuming 9.1.0, it's unlikely to be granted.

The fix for this has been upstream since libvirt 8.7.0 (commit e4f9682ebc442bb5dfee807ba618c8863355776d). RHEL 9.2.0 currently ships libvirt 8.9.0, so this bug should already be fixed in 9.2.0 by bug #2124466.

Already in libvirt-8.7.0-1.el9.
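The delays quoted in this discussion can be recomputed from the udevadm monitor output. The following Python sketch parses the monitor lines as they appear in this report and measures the time from the first KERNEL add of the mdev to the first UDEV bind of the mdev. The function names and the choice of those two events as endpoints are assumptions made for illustration, not something libvirt does.

```python
import re

# Matches lines such as:
#   KERNEL[260459.337234] add /devices/... (mdev)
#   UDEV [260459.340162] add /devices/... (mdev)
EVENT_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\w+)\s+(\S+)\s+\((\w+)\)$"
)


def parse_events(text):
    """Return (source, timestamp, action, devpath, subsystem) tuples."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            source, ts, action, devpath, subsystem = match.groups()
            events.append((source, float(ts), action, devpath, subsystem))
    return events


def add_to_bind_ms(events):
    """Milliseconds from the first KERNEL 'add' to the first UDEV 'bind'
    among mdev events (hypothetical measurement endpoints)."""
    add_ts = next(ts for src, ts, act, _, sub in events
                  if src == "KERNEL" and act == "add" and sub == "mdev")
    bind_ts = next(ts for src, ts, act, _, sub in events
                   if src == "UDEV" and act == "bind" and sub == "mdev")
    return (bind_ts - add_ts) * 1000.0


# Second create sequence copied from the monitor output in this report.
SAMPLE = """\
KERNEL[260459.337234] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.340162] add /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
KERNEL[260459.590260] add /devices/virtual/vfio/109 (vfio)
KERNEL[260459.590289] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.591622] bind /devices/pci0000:00/0000:00:02.0/0000:04:00.0/710766c7-994c-40d1-be44-1a670bdfece2 (mdev)
UDEV [260459.596676] add /devices/virtual/vfio/109 (vfio)
"""

if __name__ == "__main__":
    delay = add_to_bind_ms(parse_events(SAMPLE))
    print(f"add to bind: {delay:.1f} ms, exceeds 100 ms budget: {delay > 100}")
```

For the sample above this comes out at roughly 254 ms, well beyond a 100 ms wait and comfortably inside a 1 second one.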
Reproduced on OSP17.0 with libvirt packages:
$ sudo podman exec -it nova_virtqemud rpm -qa | grep libvirt| grep driver
libvirt-daemon-driver-secret-8.0.0-8.1.el9_0.x86_64
libvirt-daemon-driver-nwfilter-8.0.0-8.1.el9_0.x86_64
libvirt-daemon-driver-storage-core-8.0.0-8.1.el9_0.x86_64
libvirt-daemon-driver-nodedev-8.0.0-8.1.el9_0.x86_64
libvirt-daemon-driver-qemu-8.0.0-8.1.el9_0.x86_64
Reproduction steps:
1. Prepare the vGPU environment on OSP17.0
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/name
GRID M60-8Q
[heat-admin@compute-0 ~]$ uuid=$(uuidgen)
[heat-admin@compute-0 ~]$ cd /sys/class/mdev_bus/0000:3d:00.0/mdev_supported_types/nvidia-22
[heat-admin@compute-0 nvidia-22]$ sudo chmod 666 create
[heat-admin@compute-0 nvidia-22]$ sudo echo $uuid
5b5ed2cb-cd00-4afe-b975-1fe467a1757f
[heat-admin@compute-0 nvidia-22]$ sudo echo $uuid > create
[heat-admin@compute-0 nvidia-22]$ cd ../../
[heat-admin@compute-0 0000:3d:00.0]$ ls
5b5ed2cb-cd00-4afe-b975-1fe467a1757f config driver_override local_cpulist ......
2. Check in nova_virtqemud, mdev is not present in the list of node devices
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud python3
Python 3.9.10 (main, Feb 9 2022, 00:00:00)
[GCC 11.2.1 20220127 (Red Hat 11.2.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import libvirt
>>> conn = libvirt.open('qemu:///system')
>>> conn.listDevices('mdev')
[]
3. Restart the nova_virtnodedevd
[heat-admin@compute-0 0000:3d:00.0]$ sudo systemctl restart tripleo_nova_virtnodedevd.service
[heat-admin@compute-0 ~]$ sudo podman ps|grep nova_virt
......
967b3774fcfd dell-per740-66.ctlplane.localdomain:8787/rh-osbs/rhosp17-openstack-nova-libvirt:17.0_20220908.1 kolla_start 10 days ago Up 6 seconds ago nova_virtnodedevd
4a07bd7374b2 dell-per740-66.ctlplane.localdomain:8787/rh-osbs/rhosp17-openstack-nova-libvirt:17.0_20220908.1 kolla_start 10 days ago Up 40 minutes ago nova_virtstoraged
a5dc3c6fa7c2 dell-per740-66.ctlplane.localdomain:8787/rh-osbs/rhosp17-openstack-nova-libvirt:17.0_20220908.1 kolla_start 10 days ago Up 40 minutes ago nova_virtqemud
4. Check in nova_virtqemud, mdev is present in the list of node devices
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud python3
Python 3.9.10 (main, Feb 9 2022, 00:00:00)
[GCC 11.2.1 20220127 (Red Hat 11.2.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import libvirt
>>> conn = libvirt.open('qemu:///system')
>>> conn.listDevices('mdev')
['mdev_5b5ed2cb_cd00_4afe_b975_1fe467a1757f_0000_3d_00_0']
>>>

(In reply to chhu from comment #37)
> Reproduced on OSP17.0 with libvirt packages:
> $ sudo podman exec -it nova_virtqemud rpm -qa | grep libvirt| grep driver
> libvirt-daemon-driver-secret-8.0.0-8.1.el9_0.x86_64
> libvirt-daemon-driver-nwfilter-8.0.0-8.1.el9_0.x86_64
> libvirt-daemon-driver-storage-core-8.0.0-8.1.el9_0.x86_64
> libvirt-daemon-driver-nodedev-8.0.0-8.1.el9_0.x86_64
> libvirt-daemon-driver-qemu-8.0.0-8.1.el9_0.x86_64

This problem is not expected to be fixed yet in libvirt 8.0.0-8.1.el9_0. This bug is about the libvirt version in RHEL 9.2. The problem should be fixed in libvirt-8.7.0-1.el9 (RHEL 9.2) and is in the process of being backported to older releases. See the cloned bugs mentioned above for those releases.

Tested on OSP17.0 with libvirt packages:
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud rpm -qa | grep libvirt| grep driver
libvirt-daemon-driver-nwfilter-8.9.0-2.el9.x86_64
libvirt-daemon-driver-qemu-8.9.0-2.el9.x86_64
libvirt-daemon-driver-storage-core-8.9.0-2.el9.x86_64
libvirt-daemon-driver-nodedev-8.9.0-2.el9.x86_64
libvirt-daemon-driver-secret-8.9.0-2.el9.x86_64
Test steps:
1. Prepare the vGPU environment on OSP17.0
(undercloud) [stack@dell-per740-66 ~]$ ssh heat-admin.24.10
[heat-admin@compute-0 ~]$ lspci|grep VGA
03:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
3d:00.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)
3e:00.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/name
GRID M60-8Q
[heat-admin@compute-0 ~]$ uuid=$(uuidgen)
[heat-admin@compute-0 ~]$ cd /sys/class/mdev_bus/0000:3d:00.0/mdev_supported_types/nvidia-22
[heat-admin@compute-0 nvidia-22]$ sudo chmod 666 create
[heat-admin@compute-0 nvidia-22]$ sudo echo $uuid
b81a2fb4-1bcf-45b0-b61e-efba7f35b161
[heat-admin@compute-0 nvidia-22]$ sudo echo $uuid > create
[heat-admin@compute-0 nvidia-22]$ cd ../../
[heat-admin@compute-0 0000:3d:00.0]$ ls
b81a2fb4-1bcf-45b0-b61e-efba7f35b161 d3cold_allowed iommu mdev_supported_types rescan resource3_wc
2. Check in nova_virtqemud, mdev is present in the list of node devices
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud python3
Python 3.9.10 (main, Feb 9 2022, 00:00:00)
[GCC 11.2.1 20220127 (Red Hat 11.2.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import libvirt
>>> conn = libvirt.open('qemu:///system')
>>> conn.listDevices('mdev')
['mdev_b81a2fb4_1bcf_45b0_b61e_efba7f35b161_0000_3d_00_0']
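As a small aid for checking such listings, the node device names in this report appear to follow the pattern mdev_<uuid>_<parent PCI address>, with dashes, colons and dots replaced by underscores. The Python sketch below builds the expected name from that observed pattern. The helper names are hypothetical, and the pattern is inferred from the outputs shown here rather than taken from libvirt documentation, so other libvirt versions may name devices differently.

```python
def mdev_nodedev_name(uuid: str, parent_pci_addr: str) -> str:
    """Build the node device name seen in this report for an mdev.

    Assumption: name = "mdev_" + uuid + "_" + parent address, with
    '-', ':' and '.' replaced by '_'.
    """
    parent = parent_pci_addr.replace(":", "_").replace(".", "_")
    return f"mdev_{uuid.replace('-', '_')}_{parent}"


def is_listed(device_names, uuid: str, parent_pci_addr: str) -> bool:
    """Check whether the expected mdev name is in a list of names, such as
    the list returned by conn.listDevices('mdev') in the steps above."""
    return mdev_nodedev_name(uuid, parent_pci_addr) in device_names


if __name__ == "__main__":
    names = ["mdev_b81a2fb4_1bcf_45b0_b61e_efba7f35b161_0000_3d_00_0"]
    print(is_listed(names, "b81a2fb4-1bcf-45b0-b61e-efba7f35b161", "0000:3d:00.0"))
```

An empty list from the listing call, as in the reproduction on the unfixed packages, would make is_listed return False for the freshly created UUID.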
Failed to start a VM with vGPU in OSP17.0, filed OpenStack Bug 2142768, but it will not block this bug's verification.
Bug 2142768 - Failed to create VM with vGPU - Hit error "badly formed hexadecimal UUID string"

Add more test results here:
1. Create and delete the mdev device by using uuid, check the available_instances and `virsh nodedev-list` outputs are correct
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/available_instances
1
[heat-admin@compute-0 ~]$ uuid=$(uuidgen)
[heat-admin@compute-0 ~]$ sudo echo $uuid > /sys/class/mdev_bus/0000:3d:00.0/mdev_supported_types/nvidia-22/create
[heat-admin@compute-0 ~]$ sudo ls /sys/class/mdev_bus/0000:3d:00.0| grep $uuid
639913d4-247e-41e2-ac46-2e1eb4b32730
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/available_instances
0
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
mdev_639913d4_247e_41e2_ac46_2e1eb4b32730_0000_3d_00_0
[heat-admin@compute-0 ~]$ sudo chmod 666 /sys/bus/mdev/devices/639913d4-247e-41e2-ac46-2e1eb4b32730/remove
[heat-admin@compute-0 ~]$ sudo echo 1 > /sys/bus/mdev/devices/639913d4-247e-41e2-ac46-2e1eb4b32730/remove
[heat-admin@compute-0 ~]$ sudo ls /sys/class/mdev_bus/0000:3d:00.0| grep $uuid
No output
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/available_instances
1
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
No output
2. Create and delete the mdev device by virsh commands, check the `virsh nodedev-create,destroy,list,dumpxml` outputs are correct
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud cat mdev.xml
<device>
  <parent>pci_0000_3d_00_0</parent>
  <capability type='mdev'>
    <type id='nvidia-22'/>
    <uuid>d7277b0f-ef00-4fc9-bcc3-300b0b33a638</uuid>
  </capability>
</device>
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-create mdev.xml
Node device mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0 created from mdev.xml
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/available_instances
0
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-dumpxml mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0
<device>
  <name>mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0</name>
  <path>/sys/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0/0000:3c:08.0/0000:3d:00.0/d7277b0f-ef00-4fc9-bcc3-300b0b33a638</path>
  <parent>pci_0000_3d_00_0</parent>
  <driver>
    <name>nvidia-vgpu-vfio</name>
  </driver>
  <capability type='mdev'>
    <type id='nvidia-22'/>
    <uuid>d7277b0f-ef00-4fc9-bcc3-300b0b33a638</uuid>
    <parent_addr>0000:3d:00.0</parent_addr>
    <iommuGroup number='138'/>
  </capability>
</device>
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-destroy mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0
Destroyed node device 'mdev_d7277b0f_ef00_4fc9_bcc3_300b0b33a638_0000_3d_00_0'
[heat-admin@compute-0 ~]$ sudo podman exec -it nova_virtqemud virsh nodedev-list --cap mdev
No output
[heat-admin@compute-0 ~]$ cat /sys/class/mdev_bus/0000\:3d\:00.0/mdev_supported_types/nvidia-22/available_instances
1

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171