Bug 1812586 - [SR-IOV] SR-IOV information is missing because of 'block_path is None'
Summary: [SR-IOV] SR-IOV information is missing because of 'block_path is None'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.4.0
Target Release: 4.40.6
Assignee: Milan Zamazal
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-11 15:55 UTC by Dominik Holler
Modified: 2020-05-20 20:02 UTC (History)

Fixed In Version: vdsm-4.40.6
Clone Of:
Environment:
Last Closed: 2020-05-20 20:02:18 UTC
oVirt Team: Virt
Embargoed:
michal.skrivanek: ovirt-4.4?


Attachments
example vdsm.log (6.08 MB, text/plain)
2020-03-11 15:55 UTC, Dominik Holler


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 107536 0 None MERGED hostdev: Fix a bytes x str issue in udev mapping retrieval 2020-06-22 10:43:31 UTC

Description Dominik Holler 2020-03-11 15:55:48 UTC
Created attachment 1669395
example vdsm.log

Description of problem:

Without any hint in the audit log, HostDevListByCapsVDSCommand is failing like this:

2020-03-11 16:48:29,321+01 ERROR [org.ovirt.engine.core.vdsbroker.HostDevListByCapsVDSCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Failed in 'HostDevListByCapsVDS' method, for vds: 'xxx'; host: 'xxx': null
2020-03-11 16:48:29,322+01 ERROR [org.ovirt.engine.core.vdsbroker.HostDevListByCapsVDSCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Command 'HostDevListByCapsVDSCommand(HostName = xxx, VdsIdAndVdsVDSCommandParametersBase:{hostId='0923631c-48f6-4116-8e7c-0bec35533209', vds='Host[lxxx,0923631c-48f6-4116-8e7c-0bec35533209]'})' execution failed: null
2020-03-11 16:48:29,324+01 ERROR [org.ovirt.engine.core.bll.network.host.RefreshHostCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Transaction rolled-back for command 'org.ovirt.engine.core.bll.network.host.RefreshHostCommand'.

The command fails because block_path is null.

This leads to a situation in which no Host Devices are known to the Engine, and SR-IOV is no longer usable.


Version-Release number of selected component (if applicable):
vdsm-4.40.5-1.el8ev.x86_64


How reproducible:
100% on a dedicated host


Steps to Reproduce:
1. Refresh Capabilities from Engine


Actual results:
Host Devices are not populated in Engine.

Expected results:
Host Devices are populated in Engine.

Additional info:
The Engine code assumes that block_path is not null whenever the attribute exists; see
https://github.com/oVirt/ovirt-engine/blob/d11f8bb7e50d3720d736c99e457c2723e109dfd9/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/vdsbroker/VdsBrokerObjectsBuilder.java#L2609

The relevant code was introduced in VDSM by https://gerrit.ovirt.org/#/c/106776/ .

Comment 1 Michal Skrivanek 2020-03-11 16:12:18 UTC
The code in https://gerrit.ovirt.org/#/c/106776/5/lib/vdsm/common/hostdev.py should have been added only for SCSI block devices, not for all devices (such as SR-IOV ones). In such a case it shouldn't raise an error and it shouldn't add a None/null value.

should be an easy fix
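
A minimal sketch of the behaviour described in this comment, not the actual hostdev.py code. The function name, the dictionary layout, and the `udev_block_paths` mapping are assumptions made only for illustration: block_path is reported for SCSI devices alone, and the key is omitted rather than set to None when no path is known.

```python
# Hypothetical sketch; names and data layout are assumptions,
# not taken from vdsm's lib/vdsm/common/hostdev.py.
def device_params(capability, address, udev_block_paths):
    """Build the parameter dict reported for a single host device.

    udev_block_paths maps a device address to its block path.
    """
    params = {"capability": capability}
    if capability == "scsi":
        block_path = udev_block_paths.get(address)
        # Leave the key out entirely instead of reporting None/null,
        # so a consumer that assumes a non-null value is never surprised.
        if block_path is not None:
            params["block_path"] = block_path
    return params


paths = {"0:0:0:0": "/dev/sda"}
scsi_known = device_params("scsi", "0:0:0:0", paths)
scsi_unknown = device_params("scsi", "1:0:0:0", paths)
pci_device = device_params("pci", "0000:05:10.0", paths)
```

With this shape, a non-SCSI device such as an SR-IOV virtual function never carries a block_path key at all.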

Comment 2 Milan Zamazal 2020-03-12 11:10:47 UTC
I fixed a Python 3 issue in Vdsm host device processing yesterday, which was most likely the cause of this issue (it put null as udev_path to all SCSI devices). Can you check whether the current Vdsm master (e.g. https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el8/x86_64/vdsm-4.40.5-47.git038f1ef5b.el8.x86_64.rpm) fixes the problem?
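
The linked patch is titled "Fix a bytes x str issue in udev mapping retrieval", but its code is not shown in this report. The following is only a generic illustration of how such a mismatch can yield None in Python 3; the sample data and the `parse_mapping` helper are made up for the example.

```python
# Hypothetical illustration of a Python 3 bytes/str mismatch.
# The sample output and helper are invented; this is not vdsm code.
raw_output = b"/dev/sda 0:0:0:0\n/dev/sdb 1:0:0:0\n"  # command output is bytes


def parse_mapping(text):
    """Map each device address to its block path, one pair per line."""
    mapping = {}
    for line in text.splitlines():
        path, address = line.split()
        mapping[address] = path
    return mapping


# Parsing the raw bytes gives a dict keyed by bytes objects.
broken = parse_mapping(raw_output)
# Decoding first gives a dict keyed by str, matching str lookups.
fixed = parse_mapping(raw_output.decode("utf-8"))

missing = broken.get("0:0:0:0")  # str key never equals a bytes key
found = fixed.get("0:0:0:0")
```

Under Python 2 both lookups would have succeeded, because str and bytes were the same type there; under Python 3 the first lookup silently returns None.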

Comment 6 Ryan Barry 2020-03-12 13:44:38 UTC
Milan, care to attach the patch?

Comment 7 Milan Zamazal 2020-03-12 13:55:34 UTC
Patch attached and I took the bug.

Comment 8 Sandro Bonazzola 2020-03-20 09:41:35 UTC
$ git tag --contains afa195af73bbde5e3780ea1cbe1ca58e388c9422
v4.40.6
v4.40.7

Moving to QA

Comment 9 Michael Burman 2020-03-22 10:20:11 UTC
Verified on - vdsm-4.40.7-1.el8ev.x86_64

Comment 10 Sandro Bonazzola 2020-05-20 20:02:18 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

