Bug 1429420

Summary: hostdev.list_by_caps() fails if 'drm' capability reported by libvirt
Product: [oVirt] vdsm
Reporter: Yanqiu Zhang <yanqzhan>
Component: General
Assignee: Martin Polednik <mpoledni>
Status: CLOSED CURRENTRELEASE
QA Contact: Nisim Simsolo <nsimsolo>
Severity: high
Docs Contact:
Priority: high
Version: 4.19.7
CC: bugs, chhu, danken, dyuan, hhan, lveyde, michal.skrivanek, mpoledni, mzhan, nsimsolo, pzhang, yanqzhan, yisun
Target Milestone: ovirt-4.1.2
Keywords: Regression, TestBlocker
Target Release: 4.19.11
Flags: rule-engine: ovirt-4.1+, rule-engine: blocker+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-23 08:12:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yanqiu Zhang 2017-03-06 11:07:19 UTC
Description of problem:
vdsm-network.service fails to start.

Version-Release number of selected component (if applicable):
vdsm-4.19.7-1.el7ev.x86_64
libvirt-3.1.0-1.el7.x86_64
rhevm-4.1.1.2-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Configure the abrt environment
# yum install abrt-cli augeas -y; augtool set /files/etc/abrt/abrt-action-save-package-data.conf/OpenGPGCheck no ; augtool set /files/etc/abrt/abrt.conf/AutoreportingEnabled yes; systemctl start abrtd; systemctl start abrt-ccpp

2. Install vdsm
# yum install vdsm

3. Try to start vdsm-network.service
# systemctl start vdsm-network.service
Job for vdsm-network.service failed because the control process exited with error code. See "systemctl status vdsm-network.service" and "journalctl -xe" for details.


Actual results:
1. As in step 3, vdsm-network.service fails to start.
2.# abrt-cli ls
id 78922b1b22d8948f8fdcc28f94a4dc29f785e4bb
reason: hostdev.py:417:_process_device_params:KeyError: 'drm'
time: Mon 06 Mar 2017 02:48:05 AM EST
cmdline: /usr/bin/python2 /usr/share/vdsm/vdsm-restore-net-config
package: vdsm-4.19.7-1.el7ev
uid: 0 (root)
count: 2
Directory: /var/spool/abrt/Python-2017-03-06-02:48:05-32494
Run 'abrt-cli report /var/spool/abrt/Python-2017-03-06-02:48:05-32494' for creating a case in Red Hat Customer Portal

# cat backtrace
hostdev.py:417:_process_device_params:KeyError: 'drm'

Traceback (most recent call last):
  File "/usr/share/vdsm/vdsm-restore-net-config", line 510, in <module>
    restore(args)
  File "/usr/share/vdsm/vdsm-restore-net-config", line 468, in restore
    _restore_sriov_numvfs()
  File "/usr/share/vdsm/vdsm-restore-net-config", line 89, in _restore_sriov_numvfs
    sriov_devices = _get_sriov_devices()
  File "/usr/share/vdsm/vdsm-restore-net-config", line 61, in _get_sriov_devices
    devices = hostdev.list_by_caps()
  File "/usr/lib/python2.7/site-packages/vdsm/hostdev.py", line 478, in list_by_caps
    libvirt_devices = _get_devices_from_libvirt(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/hostdev.py", line 451, in _get_devices_from_libvirt
    for device in libvirt_devices)
  File "/usr/lib/python2.7/site-packages/vdsm/hostdev.py", line 451, in <genexpr>
    for device in libvirt_devices)
  File "/usr/lib/python2.7/site-packages/vdsm/hostdev.py", line 417, in _process_device_params
    for data_processor in _data_processors_map()[params['capability']]:
KeyError: 'drm'

Local variables in innermost frame:
devXML: <Element 'device' at 0x2f10ba0>
params: {'capability': 'drm', 'is_assignable': 'true'}
device_xml: "<device>\n  <name>drm_card0</name>\n  <path>/sys/devices/pci0000:00/0000:00:0a.0/0000:02:00.1/drm/card0</path>\n  <devnode type='dev'>/dev/dri/card0</devnode>\n  <parent>pci_0000_02_00_1</parent>\n  <capability type='drm'>\n    <type>primary</type>\n  </capability>\n</device>\n"
caps: <Element 'capability' at 0x2f14f60>
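
For readers following the backtrace, here is a minimal sketch of why the lookup at hostdev.py:417 raises. It parses the device_xml captured above and indexes a plain dict by the capability type. The function data_processors_map() and its keys are hypothetical stand-ins; the real contents of vdsm's _data_processors_map() are not shown in this report.

```python
import xml.etree.ElementTree as ET

# Device XML copied from the "device_xml" local variable in the backtrace.
DEVICE_XML = (
    "<device>\n"
    "  <name>drm_card0</name>\n"
    "  <path>/sys/devices/pci0000:00/0000:00:0a.0/0000:02:00.1/drm/card0</path>\n"
    "  <devnode type='dev'>/dev/dri/card0</devnode>\n"
    "  <parent>pci_0000_02_00_1</parent>\n"
    "  <capability type='drm'>\n"
    "    <type>primary</type>\n"
    "  </capability>\n"
    "</device>\n"
)


def data_processors_map():
    # Hypothetical stand-in for vdsm's _data_processors_map(); the keys are
    # assumptions chosen only to show a map that has no 'drm' entry.
    return {"pci": [], "usb_device": [], "net": []}


def capability_of(device_xml):
    """Return the 'type' attribute of the device's <capability> element."""
    root = ET.fromstring(device_xml)
    return root.find("capability").attrib["type"]


def lookup_fails(device_xml):
    """Return True if a plain dict lookup raises KeyError for this device."""
    try:
        data_processors_map()[capability_of(device_xml)]
    except KeyError:
        return True
    return False
```

With the XML above, capability_of() yields 'drm', and any map that was written before libvirt started reporting that capability has no such key, so the unguarded subscript fails.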


Expected results:
vdsm-network.service should start successfully

Additional info:
This also blocks vdsmd.service from starting:
# systemctl start vdsmd
A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
# journalctl -xe
...
Mar 06 03:34:59 hp-dl385g8-01.lab.eng.pek2.redhat.com systemd[1]: Job mom-vdsm.service/start failed with result 'dependency'.
Mar 06 03:34:59 hp-dl385g8-01.lab.eng.pek2.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'.
Mar 06 03:34:59 hp-dl385g8-01.lab.eng.pek2.redhat.com systemd[1]: Unit vdsm-network.service entered failed state.
Mar 06 03:34:59 hp-dl385g8-01.lab.eng.pek2.redhat.com systemd[1]: vdsm-network.service failed.
Mar 06 03:34:59 hp-dl385g8-01.lab.eng.pek2.redhat.com polkitd[950]: Unregistered Authentication Agent for unix-process:1321:7193460 (system bus name :1.167, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)

Comment 2 Dan Kenigsberg 2017-03-06 22:09:27 UTC
Martin, could you take a look?

Comment 3 Red Hat Bugzilla Rules Engine 2017-03-06 22:42:09 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Martin Polednik 2017-03-07 14:14:51 UTC
The tests are run with libvirt 3.1.0 which is way newer than I'd expect. Is there any reason for that?

git tag --contains 7f1bdec5fa
v3.1.0
v3.1.0-rc1
v3.1.0-rc2

Comment 5 Han Han 2017-03-08 02:20:50 UTC
(In reply to Martin Polednik from comment #4)
> The tests are run with libvirt 3.1.0 which is way newer than I'd expect. Is
> there any reason for that?
> 
> git tag --contains 7f1bdec5fa
> v3.1.0
> v3.1.0-rc1
> v3.1.0-rc2

Yes. It is triggered by a new feature in libvirt-3.1:
https://libvirt.org/news.html
nodedev: add drm capability
Add a new 'drm' capability for Direct Rendering Manager (DRM) devices, providing device type information.
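
Since newer libvirt releases can introduce capability types that the caller does not know about, one defensive option is to skip unknown capabilities instead of indexing a dict directly. The sketch below is only an illustration of that idea and not necessarily the fix that was merged in vdsm; KNOWN_CAPS and split_by_known_capability() are hypothetical names, and the capability list is an assumption.

```python
# Hypothetical set of capabilities the caller knows how to process.
KNOWN_CAPS = frozenset(["pci", "usb_device", "net"])


def split_by_known_capability(devices, known_caps=KNOWN_CAPS):
    """Split device param dicts into (supported, skipped) lists.

    Devices whose 'capability' is not in known_caps are collected in
    'skipped' rather than raising KeyError, so a capability added by a
    newer libvirt does not abort the whole listing.
    """
    supported, skipped = [], []
    for params in devices:
        if params.get("capability") in known_caps:
            supported.append(params)
        else:
            skipped.append(params)
    return supported, skipped


# Usage: the first dict is the 'params' value from the backtrace; the second
# is an invented example of a device with a known capability.
devices = [
    {"capability": "drm", "is_assignable": "true"},
    {"capability": "pci", "is_assignable": "true"},
]
supported, skipped = split_by_known_capability(devices)
```

The skipped list could be logged at debug level so that new capability types remain visible without failing vdsm-restore-net-config.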

Comment 6 Red Hat Bugzilla Rules Engine 2017-03-08 14:37:23 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 7 Nisim Simsolo 2017-05-09 10:55:31 UTC
Verification build: 
Red Hat Enterprise Linux Server release 7.4 Beta (Maipo)
kernel-3.10.0-663.el7.x86_64
libvirt-client-3.2.0-4.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64
vdsm-4.19.12-1.el7ev.x86_64
sanlock-3.5.0-1.el7.x86_64