Bug 1812586

Summary: [SR-IOV] SR-IOV information is missing because of 'block_path is None'
Product: [oVirt] vdsm Reporter: Dominik Holler <dholler>
Component: General Assignee: Milan Zamazal <mzamazal>
Status: CLOSED CURRENTRELEASE QA Contact: Michael Burman <mburman>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 4.40.5 CC: bugs, mburman, michal.skrivanek, mzamazal, rbarry
Target Milestone: ovirt-4.4.0 Keywords: Automation, AutomationBlocker, Regression
Target Release: 4.40.6 Flags: michal.skrivanek: ovirt-4.4?
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: vdsm-4.40.6 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-20 20:02:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
example vdsm.log none

Description Dominik Holler 2020-03-11 15:55:48 UTC
Created attachment 1669395
example vdsm.log

Description of problem:

Without any hint in the audit log, HostDevListByCapsVDSCommand is failing like this:

2020-03-11 16:48:29,321+01 ERROR [org.ovirt.engine.core.vdsbroker.HostDevListByCapsVDSCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Failed in 'HostDevListByCapsVDS' method, for vds: 'xxx'; host: 'xxx': null
2020-03-11 16:48:29,322+01 ERROR [org.ovirt.engine.core.vdsbroker.HostDevListByCapsVDSCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Command 'HostDevListByCapsVDSCommand(HostName = xxx, VdsIdAndVdsVDSCommandParametersBase:{hostId='0923631c-48f6-4116-8e7c-0bec35533209', vds='Host[lxxx,0923631c-48f6-4116-8e7c-0bec35533209]'})' execution failed: null
2020-03-11 16:48:29,324+01 ERROR [org.ovirt.engine.core.bll.network.host.RefreshHostCommand] (EE-ManagedThreadFactory-engine-Thread-25) [26e5e60d] Transaction rolled-back for command 'org.ovirt.engine.core.bll.network.host.RefreshHostCommand'.

This happens because block_path is null.

This leads to a situation in which no Host Devices are known to the Engine and SR-IOV is no longer usable.


Version-Release number of selected component (if applicable):
vdsm-4.40.5-1.el8ev.x86_64


How reproducible:
100% on a dedicated host


Steps to Reproduce:
1. Refresh Capabilities from Engine


Actual results:
Host Devices are not populated in Engine.

Expected results:
Host Devices are populated in Engine.

Additional info:
The Engine code assumes that block_path is not null if the attribute exists; see
https://github.com/oVirt/ovirt-engine/blob/d11f8bb7e50d3720d736c99e457c2723e109dfd9/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/vdsbroker/VdsBrokerObjectsBuilder.java#L2609

The relevant code was introduced in VDSM by https://gerrit.ovirt.org/#/c/106776/ .
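
To illustrate the failure mode described above, here is a minimal sketch in Python (the Engine itself is Java, so this is not the Engine code). It assumes a consumer that treats "the key exists" as "the value is a usable string". The function names and the use of strip() are hypothetical and stand in for any operation on the value.

```python
def parse_block_path_strict(device_params):
    """Hypothetical consumer: assumes a present key always holds a string."""
    if "block_path" in device_params:
        # Fails with AttributeError when the producer sent None/null.
        return device_params["block_path"].strip()
    return None


def parse_block_path_tolerant(device_params):
    """Hypothetical consumer: treats a missing key and a None value alike."""
    value = device_params.get("block_path")
    return value.strip() if value is not None else None
```

With this sketch, a device reported as {"block_path": None} breaks the strict variant, while the tolerant variant returns None for it.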

Comment 1 Michal Skrivanek 2020-03-11 16:12:18 UTC
The code in https://gerrit.ovirt.org/#/c/106776/5/lib/vdsm/common/hostdev.py should have been added only for SCSI block devices, not for all devices (such as SR-IOV). In that case it shouldn't raise an error and it shouldn't add a None/null value.

Should be an easy fix.
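
A minimal sketch of the behaviour suggested in this comment, under stated assumptions: it is not the actual hostdev.py code, and the function name, the "capability" key, and the parameters are hypothetical. The idea is to report block_path only for SCSI devices, and only when a path is actually known, so that the key is omitted instead of carrying None/null.

```python
def build_device_params(capability, block_path=None):
    """Hypothetical helper building the reported parameters of one host device."""
    params = {"capability": capability}
    # Only SCSI block devices have a block path. For every other device,
    # or when the path could not be determined, leave the key out entirely
    # instead of raising an error or reporting None.
    if capability == "scsi" and block_path is not None:
        params["block_path"] = block_path
    return params
```

In this sketch, a PCI device such as an SR-IOV virtual function would be reported without any block_path key.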

Comment 2 Milan Zamazal 2020-03-12 11:10:47 UTC
I fixed a Python 3 issue in Vdsm host device processing yesterday, which was most likely the cause of this issue (it set udev_path to null for all SCSI devices). Can you check whether the current Vdsm master (e.g. https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el8/x86_64/vdsm-4.40.5-47.git038f1ef5b.el8.x86_64.rpm) fixes the problem?

Comment 6 Ryan Barry 2020-03-12 13:44:38 UTC
Milan, care to attach the patch?

Comment 7 Milan Zamazal 2020-03-12 13:55:34 UTC
Patch attached and I took the bug.

Comment 8 Sandro Bonazzola 2020-03-20 09:41:35 UTC
$ git tag --contains afa195af73bbde5e3780ea1cbe1ca58e388c9422
v4.40.6
v4.40.7

Moving to QA

Comment 9 Michael Burman 2020-03-22 10:20:11 UTC
Verified on - vdsm-4.40.7-1.el8ev.x86_64

Comment 10 Sandro Bonazzola 2020-05-20 20:02:18 UTC
This bugzilla is included in the oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.