Bug 979290

Summary: The driver in output of virsh command nodedev-dumpxml should be updated timely
Product: Red Hat Enterprise Linux 7 Reporter: Xuesong Zhang <xuzhang>
Component: libvirtAssignee: Laine Stump <laine>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.0CC: acathrow, cwei, dyuan, harald, honzhang, kay, laine, tzheng
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-1.1.0-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 979330 (view as bug list) Environment:
Last Closed: 2014-06-13 12:02:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 979330    

Description Xuesong Zhang 2013-06-28 07:32:06 UTC
Description of problem:
The driver in output of virsh command nodedev-dumpxml should be updated timely.

Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
3.10.0-0.rc6.63.el7.x86_64

How reproducible:
100%

Steps:
1. check the pci driver with virsh command nodedev-dumpxml.
# virsh nodedev-dumpxml pci_0000_04_00_0
<device>
  <name>pci_0000_04_00_0</name>
  <parent>pci_0000_00_1e_0</parent>
  <driver>
    <name>e1000</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x107c'>82541PI Gigabit Ethernet Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>
2. check it's driver with kernel command
# readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/
/sys/bus/pci/drivers/e1000
3. change the pci's driver to others, such as change from driver e1000 to driver pci-stub
# virsh nodedev-detach pci_0000_04_00_0 --driver kvm
Device pci_0000_04_00_0 detached
4. check the driver with kernel command again, it is changed to pci-stub
# readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/
/sys/bus/pci/drivers/pci-stub
5. check the driver with virsh command nodedev-dumpxml, it didn't be updated.
# virsh nodedev-dumpxml pci_0000_04_00_0
<device>
  <name>pci_0000_04_00_0</name>
  <parent>pci_0000_00_1e_0</parent>
  <driver>
    <name>e1000</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x107c'>82541PI Gigabit Ethernet Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>


Actual results:
the driver in step 5 didn't be changed to pci-stub

Expected results:
the driver in step 5 should be changed to pci-stub

Comment 2 Laine Stump 2013-06-28 15:00:57 UTC
The problem here is that udev is expecting its device cache to be updated by a "change" event received from udev when a device is changed. However, libvirt only receives "add" events as new devices are added, but doesn't receive any "change" events (I witnessed this by attaching gdb and putting a breakpoint on  udevAddOneDevice(), which should be called for each add and change event).

What I *do* see is the following:

1) during a nodedev-detach, there are 6 "remove" events, nothing else.

2) during a nodedev-reattach, there are 3 "add" and 2 "move" events.

I don't know if the problem is that libvirt's code was written based on incorrect assumptions about the events that udev sends, or if udev is missing some events.

Comment 3 Laine Stump 2013-06-28 17:12:29 UTC
Harald or Kay - I see your names on udev BZ records and documentation, so I thought one of you could provide the answer to this more quickly than a udev noob like me could find it...

wrt Comment 2, as far as I can see in udev documentation, udev should be sending a change even any time any driver property is changed. However, when an SRIOV Virtual Function (attached to the igbvf driver) is detached from that driver and attached to the pci-stub driver, "udevadm monitor" only shows the following events:

KERNEL[33024.212336] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/rx-0 (queues)
KERNEL[33024.212442] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/tx-0 (queues)
KERNEL[33024.212692] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0 (net)
UDEV  [33024.214482] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/rx-0 (queues)
UDEV  [33024.214577] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/tx-0 (queues)
UDEV  [33024.215514] remove   /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0 (net)

However, the udev db has definitely been told that the device's properties have changed. Here is the db entry before switching to pci-stub:

P: /devices/pci0000:00/0000:00:04.0/0000:02:10.1
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:02:10.1
E: DRIVER=igbvf
E: ID_MODEL_FROM_DATABASE=82576 Virtual Function
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_VENDOR_FROM_DATABASE=Intel Corporation
E: MODALIAS=pci:v00008086d000010CAsv00008086sd0000A03Cbc02sc00i00
E: PCI_CLASS=20000
E: PCI_ID=8086:10CA
E: PCI_SLOT_NAME=0000:02:10.1
E: PCI_SUBSYS_ID=8086:A03C
E: SUBSYSTEM=pci
E: USEC_INITIALIZED=39982509603

And here is the db entry *after* the switch to pci-stub:

P: /devices/pci0000:00/0000:00:04.0/0000:02:10.1
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:02:10.1
E: DRIVER=pci-stub
E: ID_MODEL_FROM_DATABASE=82576 Virtual Function
E: ID_PCI_CLASS_FROM_DATABASE=Network controller
E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller
E: ID_VENDOR_FROM_DATABASE=Intel Corporation
E: MODALIAS=pci:v00008086d000010CAsv00008086sd0000A03Cbc02sc00i00
E: PCI_CLASS=20000
E: PCI_ID=8086:10CA
E: PCI_SLOT_NAME=0000:02:10.1
E: PCI_SUBSYS_ID=8086:A03C
E: SUBSYSTEM=pci
E: USEC_INITIALIZED=39982509603

The question is this: "Should udev be generating a "change" event as a result of the change in driver that is bound to this device?"

Note that although the original BZ was filed against RHEL7, I've also reproduced it on RHEL6.4 (Bug 979330) and Fedora 19.

Comment 4 Kay Sievers 2013-06-30 13:51:45 UTC
(In reply to Laine Stump from comment #3)
> The question is this: "Should udev be generating a "change" event as a
> result of the change in driver that is bound to this device?"

No.

Udev does never generate any event on its own, all events originate from
the kernel itself, udev only adds properties to kernel events, and can hook
things into the events.

The kernel does not send events for a device if the driver that binds
to the device is updated. So there is no event expected in your case.

Drivers usually create devices below the device they bind to, like a
new network interface below a PCI device, and these devices generate
add and remove events, but the device they bind to will not.

Comment 5 Laine Stump 2013-06-30 18:15:40 UTC
(Thanks for your response. I'm obviously out of my element here :-)

(In reply to Kay Sievers from comment #4)

> Udev does never generate any event on its own, all events originate from
> the kernel itself, udev only adds properties to kernel events, and can hook
> things into the events.

You'll have to forgive my use of "udev" to mean "anything in udev or below", as that's the view from libvirt's point of reference.


> 
> The kernel does not send events for a device if the driver that binds
> to the device is updated. So there is no event expected in your case.

So is the DRIVER setting in the udev db not really stored in any db at all, but simply read from sysfs at the time "udevadm info --export-db" is run?

> 
> Drivers usually create devices below the device they bind to, like a
> new network interface below a PCI device, and these devices generate
> add and remove events, but the device they bind to will not.

What are "change" events used for?


At any rate, it sounds like libvirt needs to be repopulating the driver name from sysfs each time a device's info is requested (just as it does for the HAL driver).

Comment 6 Laine Stump 2013-06-30 18:57:06 UTC
I just posted a fix upstream for review (local testing on F19 shows it to work properly):

https://www.redhat.com/archives/libvir-list/2013-June/msg01250.html

Comment 7 Kay Sievers 2013-06-30 19:13:44 UTC
(In reply to Laine Stump from comment #5)
> So is the DRIVER setting in the udev db not really stored in any db at all,
> but simply read from sysfs at the time "udevadm info --export-db" is run?

It's a property of the kernel, exported in sysfs. Udevadm just reads it
directly from the kernel, it is not stored anywhere else.

> > Drivers usually create devices below the device they bind to, like a
> > new network interface below a PCI device, and these devices generate
> > add and remove events, but the device they bind to will not.
> 
> What are "change" events used for?

It's only explicitly added subsystem-specific code, that sends change events,
they are added to the kernel to send out for only very few devices, devices
which change some state that should be announced to userspace to update its
view.

The main use case is a re-configuration of the device, which which changes
the device's behavior, like an USB device can have multiple entirely
different hardware configs, only one of the configs can be active at a
given time. Changing that config does not change the device in sysfs
itself, but the behavior of the device is entirely different, so userspace
needs to be informed about it.

Another example is device mapper block devices, which start as a,
non-functional "dead" device in sysfs, and after they are configured, they
become real working block devices, so this state change needs to be announced.

> At any rate, it sounds like libvirt needs to be repopulating the driver name
> from sysfs each time a device's info is requested (just as it does for the
> HAL driver).

Sounds like, yes.

In general, I would be very careful with mirroring any kernel state
in userspace tools for a longer time. All these things should just be
re-read all the time they are requested, unless inner-loops will require
caching for performance reasons.

The kernel and today's hardware changes its state all the time, keeping
copies of the kernel device data cached in tools is the wrong thing to
do in most cases.

Comment 8 Laine Stump 2013-07-01 04:33:36 UTC
A fix for this bug was pushed upstream and will be in libvirt-1.1.0:

commit 374c5e4f73577b316fed19bbcefe74b75195eb95
Author: Laine Stump <laine>
Date:   Sun Jun 30 14:49:21 2013 -0400

    node device driver: update driver name during dumpxml
    
    This fixes:
    
      https://bugzilla.redhat.com/show_bug.cgi?id=979290
      https://bugzilla.redhat.com/show_bug.cgi?id=979330
    
    The node device driver was written with the assumption that udev would
    use a "change" event to notify libvirt of any change to device status
    (including the name of the driver it was bound to). It turns out this
    is not the case (see Comment 4 of BZ 979290). That means that a
    dumpxml for a device would always show whatever driver happened to be
    bound at the time libvirt was started (when the node device cache was
    built).
    
    There was already code in the driver (for the benefit of the HAL
    backend) that updated the driver name from sysfs each time a device's
    info was retrieved from the cache. This patch just enables that manual
    update for the udev backend as well.

Comment 9 Xuesong Zhang 2013-07-03 02:56:08 UTC
Verify this bug with build libvirt-1.1.0-1.el7.x86_64

The driver in command nodedev-dumpxml is updated timely, so change the bug status to "verify".

steps:
1. check the pci driver with virsh command nodedev-dumpxml.
# virsh nodedev-dumpxml pci_0000_04_00_0
<device>
  <name>pci_0000_04_00_0</name>
  <parent>pci_0000_00_1e_0</parent>
  <driver>
    <name>e1000</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x107c'>82541PI Gigabit Ethernet Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>
2. check it's driver with kernel command
# readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/
/sys/bus/pci/drivers/e1000
3. change the pci's driver to others, such as change from driver e1000 to driver pci-stub
# virsh nodedev-detach pci_0000_04_00_0 --driver kvm
Device pci_0000_04_00_0 detached
4. check the driver with kernel command again, it is changed to pci-stub
# readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/
/sys/bus/pci/drivers/pci-stub
5. check the driver with virsh command nodedev-dumpxml, it didn't be updated.
# virsh nodedev-dumpxml pci_0000_04_00_0
<device>
  <name>pci_0000_04_00_0</name>
  <parent>pci_0000_00_1e_0</parent>
  <driver>
    <name>pci-stub</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>4</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x107c'>82541PI Gigabit Ethernet Controller</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>

Comment 10 tingting zheng 2014-02-12 07:14:29 UTC
I retested the bug with:
libvirt-1.1.1-22.el7.x86_64
qemu-kvm-1.5.3-45.el7.x86_64

steps:
1. check the pci driver with virsh command nodedev-dumpxml.
# virsh nodedev-dumpxml pci_0000_00_19_0
<device>
  <name>pci_0000_00_19_0</name>
  <path>/sys/devices/pci0000:00/0000:00:19.0</path>
  <parent>computer</parent>
  <driver>
    <name>e1000e</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>25</slot>
    <function>0</function>
    <product id='0x10de'>82567LM-3 Gigabit Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <iommuGroup number='3'>
      <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
    </iommuGroup>
  </capability>
</device>

2. check it's driver with kernel command
# readlink -f /sys/bus/pci/devices/0000\:00\:19.0/driver/
/sys/bus/pci/drivers/e1000e

3. change the pci's driver to others.
3.1 test --driver kvm,the error shows clear about "KVM device assignment is currently not supported on this system"
# virsh nodedev-detach pci_0000_00_19_0 --driver kvm
error: Failed to detach device pci_0000_00_19_0
error: argument unsupported: KVM device assignment is currently not supported on this system

3.1 test the default driver.
# virsh nodedev-detach pci_0000_00_19_0 
Device pci_0000_00_19_0 detached

3.2 test --driver vfio.
# virsh nodedev-detach pci_0000_00_19_0 --driver vfio
Device pci_0000_00_19_0 detached

After step 3.1 and 3.2,the xml shows pci driver is updated to vfio-pci.

Comment 11 Ludek Smid 2014-06-13 12:02:44 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.