Bug 979290
Summary: | The driver in output of virsh command nodedev-dumpxml should be updated timely | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Xuesong Zhang <xuzhang> | |
Component: | libvirt | Assignee: | Laine Stump <laine> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 7.0 | CC: | acathrow, cwei, dyuan, harald, honzhang, kay, laine, tzheng | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-1.1.0-1.el7 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 979330 (view as bug list) | Environment: | ||
Last Closed: | 2014-06-13 12:02:44 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 979330 |
Description
Xuesong Zhang
2013-06-28 07:32:06 UTC
The problem here is that udev is expecting its device cache to be updated by a "change" event received from udev when a device is changed. However, libvirt only receives "add" events as new devices are added, but doesn't receive any "change" events (I witnessed this by attaching gdb and putting a breakpoint on udevAddOneDevice(), which should be called for each add and change event). What I *do* see is the following: 1) during a nodedev-detach, there are 6 "remove" events, nothing else. 2) during a nodedev-reattach, there are 3 "add" and 2 "move" events. I don't know if the problem is that libvirt's code was written based on incorrect assumptions about the events that udev sends, or if udev is missing some events. Harald or Kay - I see your names on udev BZ records and documentation, so I thought one of you could provide the answer to this more quickly than a udev noob like me could find it... wrt Comment 2, as far as I can see in udev documentation, udev should be sending a change even any time any driver property is changed. However, when an SRIOV Virtual Function (attached to the igbvf driver) is detached from that driver and attached to the pci-stub driver, "udevadm monitor" only shows the following events: KERNEL[33024.212336] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/rx-0 (queues) KERNEL[33024.212442] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/tx-0 (queues) KERNEL[33024.212692] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0 (net) UDEV [33024.214482] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/rx-0 (queues) UDEV [33024.214577] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0/queues/tx-0 (queues) UDEV [33024.215514] remove /devices/pci0000:00/0000:00:04.0/0000:02:10.1/net/p4p2_0 (net) However, the udev db has definitely been told that the device's properties have changed. Here is the db entry before switching to pci-stub: P: /devices/pci0000:00/0000:00:04.0/0000:02:10.1 E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:02:10.1 E: DRIVER=igbvf E: ID_MODEL_FROM_DATABASE=82576 Virtual Function E: ID_PCI_CLASS_FROM_DATABASE=Network controller E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller E: ID_VENDOR_FROM_DATABASE=Intel Corporation E: MODALIAS=pci:v00008086d000010CAsv00008086sd0000A03Cbc02sc00i00 E: PCI_CLASS=20000 E: PCI_ID=8086:10CA E: PCI_SLOT_NAME=0000:02:10.1 E: PCI_SUBSYS_ID=8086:A03C E: SUBSYSTEM=pci E: USEC_INITIALIZED=39982509603 And here is the db entry *after* the switch to pci-stub: P: /devices/pci0000:00/0000:00:04.0/0000:02:10.1 E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:02:10.1 E: DRIVER=pci-stub E: ID_MODEL_FROM_DATABASE=82576 Virtual Function E: ID_PCI_CLASS_FROM_DATABASE=Network controller E: ID_PCI_SUBCLASS_FROM_DATABASE=Ethernet controller E: ID_VENDOR_FROM_DATABASE=Intel Corporation E: MODALIAS=pci:v00008086d000010CAsv00008086sd0000A03Cbc02sc00i00 E: PCI_CLASS=20000 E: PCI_ID=8086:10CA E: PCI_SLOT_NAME=0000:02:10.1 E: PCI_SUBSYS_ID=8086:A03C E: SUBSYSTEM=pci E: USEC_INITIALIZED=39982509603 The question is this: "Should udev be generating a "change" event as a result of the change in driver that is bound to this device?" Note that although the original BZ was filed against RHEL7, I've also reproduced it on RHEL6.4 (Bug 979330) and Fedora 19. (In reply to Laine Stump from comment #3) > The question is this: "Should udev be generating a "change" event as a > result of the change in driver that is bound to this device?" No. Udev does never generate any event on its own, all events originate from the kernel itself, udev only adds properties to kernel events, and can hook things into the events. The kernel does not send events for a device if the driver that binds to the device is updated. So there is no event expected in your case. Drivers usually create devices below the device they bind to, like a new network interface below a PCI device, and these devices generate add and remove events, but the device they bind to will not. (Thanks for your response. I'm obviously out of my element here :-) (In reply to Kay Sievers from comment #4) > Udev does never generate any event on its own, all events originate from > the kernel itself, udev only adds properties to kernel events, and can hook > things into the events. You'll have to forgive my use of "udev" to mean "anything in udev or below", as that's the view from libvirt's point of reference. > > The kernel does not send events for a device if the driver that binds > to the device is updated. So there is no event expected in your case. So is the DRIVER setting in the udev db not really stored in any db at all, but simply read from sysfs at the time "udevadm info --export-db" is run? > > Drivers usually create devices below the device they bind to, like a > new network interface below a PCI device, and these devices generate > add and remove events, but the device they bind to will not. What are "change" events used for? At any rate, it sounds like libvirt needs to be repopulating the driver name from sysfs each time a device's info is requested (just as it does for the HAL driver). I just posted a fix upstream for review (local testing on F19 shows it to work properly): https://www.redhat.com/archives/libvir-list/2013-June/msg01250.html (In reply to Laine Stump from comment #5) > So is the DRIVER setting in the udev db not really stored in any db at all, > but simply read from sysfs at the time "udevadm info --export-db" is run? It's a property of the kernel, exported in sysfs. Udevadm just reads it directly from the kernel, it is not stored anywhere else. > > Drivers usually create devices below the device they bind to, like a > > new network interface below a PCI device, and these devices generate > > add and remove events, but the device they bind to will not. > > What are "change" events used for? It's only explicitly added subsystem-specific code, that sends change events, they are added to the kernel to send out for only very few devices, devices which change some state that should be announced to userspace to update its view. The main use case is a re-configuration of the device, which which changes the device's behavior, like an USB device can have multiple entirely different hardware configs, only one of the configs can be active at a given time. Changing that config does not change the device in sysfs itself, but the behavior of the device is entirely different, so userspace needs to be informed about it. Another example is device mapper block devices, which start as a, non-functional "dead" device in sysfs, and after they are configured, they become real working block devices, so this state change needs to be announced. > At any rate, it sounds like libvirt needs to be repopulating the driver name > from sysfs each time a device's info is requested (just as it does for the > HAL driver). Sounds like, yes. In general, I would be very careful with mirroring any kernel state in userspace tools for a longer time. All these things should just be re-read all the time they are requested, unless inner-loops will require caching for performance reasons. The kernel and today's hardware changes its state all the time, keeping copies of the kernel device data cached in tools is the wrong thing to do in most cases. A fix for this bug was pushed upstream and will be in libvirt-1.1.0: commit 374c5e4f73577b316fed19bbcefe74b75195eb95 Author: Laine Stump <laine> Date: Sun Jun 30 14:49:21 2013 -0400 node device driver: update driver name during dumpxml This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=979290 https://bugzilla.redhat.com/show_bug.cgi?id=979330 The node device driver was written with the assumption that udev would use a "change" event to notify libvirt of any change to device status (including the name of the driver it was bound to). It turns out this is not the case (see Comment 4 of BZ 979290). That means that a dumpxml for a device would always show whatever driver happened to be bound at the time libvirt was started (when the node device cache was built). There was already code in the driver (for the benefit of the HAL backend) that updated the driver name from sysfs each time a device's info was retrieved from the cache. This patch just enables that manual update for the udev backend as well. Verify this bug with build libvirt-1.1.0-1.el7.x86_64 The driver in command nodedev-dumpxml is updated timely, so change the bug status to "verify". steps: 1. check the pci driver with virsh command nodedev-dumpxml. # virsh nodedev-dumpxml pci_0000_04_00_0 <device> <name>pci_0000_04_00_0</name> <parent>pci_0000_00_1e_0</parent> <driver> <name>e1000</name> </driver> <capability type='pci'> <domain>0</domain> <bus>4</bus> <slot>0</slot> <function>0</function> <product id='0x107c'>82541PI Gigabit Ethernet Controller</product> <vendor id='0x8086'>Intel Corporation</vendor> </capability> </device> 2. check it's driver with kernel command # readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/ /sys/bus/pci/drivers/e1000 3. change the pci's driver to others, such as change from driver e1000 to driver pci-stub # virsh nodedev-detach pci_0000_04_00_0 --driver kvm Device pci_0000_04_00_0 detached 4. check the driver with kernel command again, it is changed to pci-stub # readlink -f /sys/bus/pci/devices/0000\:04\:00.0/driver/ /sys/bus/pci/drivers/pci-stub 5. check the driver with virsh command nodedev-dumpxml, it didn't be updated. # virsh nodedev-dumpxml pci_0000_04_00_0 <device> <name>pci_0000_04_00_0</name> <parent>pci_0000_00_1e_0</parent> <driver> <name>pci-stub</name> </driver> <capability type='pci'> <domain>0</domain> <bus>4</bus> <slot>0</slot> <function>0</function> <product id='0x107c'>82541PI Gigabit Ethernet Controller</product> <vendor id='0x8086'>Intel Corporation</vendor> </capability> </device> I retested the bug with: libvirt-1.1.1-22.el7.x86_64 qemu-kvm-1.5.3-45.el7.x86_64 steps: 1. check the pci driver with virsh command nodedev-dumpxml. # virsh nodedev-dumpxml pci_0000_00_19_0 <device> <name>pci_0000_00_19_0</name> <path>/sys/devices/pci0000:00/0000:00:19.0</path> <parent>computer</parent> <driver> <name>e1000e</name> </driver> <capability type='pci'> <domain>0</domain> <bus>0</bus> <slot>25</slot> <function>0</function> <product id='0x10de'>82567LM-3 Gigabit Network Connection</product> <vendor id='0x8086'>Intel Corporation</vendor> <iommuGroup number='3'> <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/> </iommuGroup> </capability> </device> 2. check it's driver with kernel command # readlink -f /sys/bus/pci/devices/0000\:00\:19.0/driver/ /sys/bus/pci/drivers/e1000e 3. change the pci's driver to others. 3.1 test --driver kvm,the error shows clear about "KVM device assignment is currently not supported on this system" # virsh nodedev-detach pci_0000_00_19_0 --driver kvm error: Failed to detach device pci_0000_00_19_0 error: argument unsupported: KVM device assignment is currently not supported on this system 3.1 test the default driver. # virsh nodedev-detach pci_0000_00_19_0 Device pci_0000_00_19_0 detached 3.2 test --driver vfio. # virsh nodedev-detach pci_0000_00_19_0 --driver vfio Device pci_0000_00_19_0 detached After step 3.1 and 3.2,the xml shows pci driver is updated to vfio-pci. This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request. |