Bug 1463285

Summary: mediated devices are not shown in nodedev-list --cap mdev output
Product: Red Hat Enterprise Linux 7 Reporter: Erik Skultety <eskultet>
Component: libvirt Assignee: Erik Skultety <eskultet>
Status: CLOSED ERRATA QA Contact: zhe peng <zpeng>
Severity: high Docs Contact:
Priority: high    
Version: 7.4 CC: dyuan, jdenemar, jsuchane, rbalakri, xuzhang
Target Milestone: rc Keywords: TestBlocker
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-3.9.0-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 10:50:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1376907, 1528122    
Bug Blocks: 1452072, 1469590    

Description Erik Skultety 2017-06-20 13:38:56 UTC
Description of problem:
If a mediated device is created while libvirtd is running, there's a chance the device won't be listed by the following command:

# virsh nodedev-list --cap mdev

Version-Release number of selected component (if applicable):
libvirt-3.2.0-9.virtcov.el7.x86_64

How reproducible:
Depends on the environment (100% on some environments, 0% on others)

Steps to Reproduce:
1. start libvirtd
2. create a mediated device
# echo `uuidgen` > /sys/class/mdev_bus/<pci_address>/mdev_supported_types/nvidia-X/create
3. virsh nodedev-list --cap mdev returns an empty list

Actual results:
The device won't show up in the output list until the daemon is restarted.

Expected results:
The device shows up in the output list

Additional info:
This is apparently due to a kernel uevent race: the 'add' uevent is sent before the whole sysfs device tree has been created, so when libvirt processes the device, some of the sysfs attributes it needs might not be exposed yet.

Until this is fixed in the upstream kernel, libvirt needs a workaround.

Comment 5 Erik Skultety 2017-08-28 11:25:59 UTC
v3 of the workaround posted upstream:
https://www.redhat.com/archives/libvir-list/2017-August/msg00703.html

Comment 6 Erik Skultety 2017-10-19 07:18:57 UTC
Workaround pushed upstream:

commit 1af45804088c5b1f19fca8631821ba5ae94cf3dd
Author:     Erik Skultety <eskultet>
AuthorDate: Tue Jun 20 16:15:22 2017 +0200
Commit:     Erik Skultety <eskultet>
CommitDate: Thu Oct 19 08:54:53 2017 +0200

    nodedev: udev: Hook up virFileWaitForAccess to work around uevent race

    If we find ourselves in the situation that the 'add' uevent has been
    fired earlier than the sysfs tree for a device was created, we should
    use the best-effort approach and give kernel some predetermined amount
    of time, thus waiting for the attributes to be ready rather than
    discarding the device from our device list forever. If those don't appear
    in the given time frame, we need to move on, since libvirt can't wait
    indefinitely.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285

    Signed-off-by: Erik Skultety <eskultet>
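
For readers unfamiliar with the approach, the best-effort wait described in the commit message can be sketched roughly as below. This is only an illustration in Python, not the libvirt code: the function name wait_for_access, its parameters, and the default interval and retry count are assumptions made for this sketch and are not taken from the patch.

```python
import os
import time


def wait_for_access(path, interval_ms=100, tries=10):
    """Poll until `path` exists or the retry budget runs out.

    Hypothetical helper: the name, signature and defaults are
    illustrative assumptions, not libvirt's actual API.
    """
    for attempt in range(tries):
        # F_OK only tests for existence, which is all we need to know
        # about a sysfs attribute that may not have been created yet.
        if os.access(path, os.F_OK):
            return True
        # Sleep between attempts, but not after the last one.
        if attempt < tries - 1:
            time.sleep(interval_ms / 1000.0)
    # Give up: the caller has to move on rather than wait forever.
    return False
```

A caller handling an 'add' event would invoke this on the device's sysfs path before reading its attributes, and skip the device only if the function returns False.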

Comment 8 zhe peng 2018-01-25 03:37:24 UTC
Verified with build:
libvirt-3.9.0-7.el7.x86_64
kernel-3.10.0-837.el7.x86_64
NVIDIA-Linux-x86_64-390.21-vgpu-kvm.run

Steps:
1. start libvirtd
2. create a mediated device
# echo `uuidgen` > /sys/class/mdev_bus/<pci_address>/mdev_supported_types/nvidia-X/create
3. virsh nodedev-list --cap mdev
mdev_1a8a1bbc_a095_4c59_8a17_341a3eff9c63

The device shows up in the list; moving to VERIFIED.
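
The check in step 3 can be automated with a small sketch like the one below. It assumes that the node device name is "mdev_" followed by the UUID with dashes replaced by underscores, which matches the name in the output above for this libvirt version but may differ in other releases. The helper names are hypothetical, and the dashed UUID in the usage note is inferred from the listed device name rather than quoted from the report.

```python
def mdev_nodedev_name(uuid):
    """Build the expected node device name for an mdev UUID.

    Assumption: the name is "mdev_" plus the UUID with '-' turned
    into '_', as seen in the output above.
    """
    return "mdev_" + uuid.replace("-", "_")


def is_listed(uuid, nodedev_list_output):
    """Return True if the mdev appears in `virsh nodedev-list` output.

    `nodedev_list_output` is the captured text of the command,
    with one device name per line.
    """
    return mdev_nodedev_name(uuid) in nodedev_list_output.split()
```

For example, is_listed("1a8a1bbc-a095-4c59-8a17-341a3eff9c63", output) would be true for the output shown in step 3.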

Comment 12 errata-xmlrpc 2018-04-10 10:50:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704