Bug 2120726

Summary: [17.0 ga known issue] Nova fails to parse new libvirt mediated device name format
Product: Red Hat OpenStack Reporter: Artom Lifshitz <alifshit>
Component: openstack-novaAssignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED EOL QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium Docs Contact:
Priority: medium    
Version: 17.0 (Wallaby)CC: dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, vromanso
Target Milestone: gaKeywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-10-10 12:23:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Artom Lifshitz 2022-08-23 15:47:35 UTC
This bug was initially created as a copy of Bug #2109616

I am copying this bug because: 

Need to document the known issue of Nova not parsing libvirt's new device name format.

Description of problem:

With libvirt 7.7, mediated device names changed, so now Nova isn't able to find them.
The impact is not trivial to see, but basically, the update of resources we do every 60 secs is now having an exception so we don't really know the right VGPU capacity left.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create an instance with a VGPU flavor
2. look at the n-cpu log, you'll see an exception every 60 secs



2021-11-19 22:51:45.952 7 ERROR nova.compute.manager [req-570c7e8f-0540-49fb-b2b0-8c2ac932e4dc - - - - -] Error updating resources for node: ValueError: badly formed hexadecimal UUID string
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager Traceback (most recent call last):
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 9993, in _update_available_resource_for_node
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager startup=startup)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 895, in update_available_resource
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager return f(*args, **kwargs)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 975, in _update_available_resource
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager self._update(context, cn, startup=startup)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1227, in _update
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager self._update_to_placement(context, compute_node, startup)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager return Retrying(*dargs, **dkw).call(f, *args, **kw)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/retrying.py", line 206, in call
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager return attempt.get(self._wrap_exception)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/retrying.py", line 247, in get
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager six.reraise(self.value[0], self.value[1], self.value[2])
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager raise value
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/retrying.py", line 200, in call
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1163, in _update_to_placement
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager self.driver.update_provider_tree(prov_tree, nodename)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 8355, in update_provider_tree
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager provider_tree, nodename, allocations=allocations)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 8757, in _update_provider_tree_for_vgpu
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager inventories_dict = self._get_gpu_inventories()
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 7597, in _get_gpu_inventories
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager count_per_parent = self._count_mediated_devices(enabled_mdev_types)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 7538, in _count_mediated_devices
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager mediated_devices = self._get_mediated_devices(types=enabled_mdev_types)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 7788, in _get_mediated_devices
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager device = self._get_mediated_device_information(name)
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 7769, in _get_mediated_device_information
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager "uuid": libvirt_utils.mdev_name2uuid(cfgdev.name),
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/utils.py", line 583, in mdev_name2uuid
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager return str(uuid.UUID(mdev_name[5:].replace('_', '-')))
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager File "/usr/lib64/python3.6/uuid.py", line 140, in __init__
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager raise ValueError('badly formed hexadecimal UUID string')
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager ValueError: badly formed hexadecimal UUID string
2021-11-19 22:51:45.952 7 ERROR nova.compute.manager


A proposal fix against upstrea master is already on the fly, we need to backport it ASAP once it's merged down to 17.0.
https://review.opendev.org/c/openstack/nova/+/838976

Comment 1 Priscila Gutierres 2023-10-10 12:23:24 UTC
Support has ended on 22 September 2023.