Bug 1046919 - node device's driver will be lost after nodedev-detach when kernel option 'intel_iommu=on' is not existed
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: x86_64 Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Denemark
Virtualization Bugs
Depends On:
Blocks:
Reported: 2013-12-27 06:44 EST by Jincheng Miao
Modified: 2014-06-17 21:01 EDT

See Also:
Fixed In Version: libvirt-1.1.1-19.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 09:11:30 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Jincheng Miao 2013-12-27 06:44:37 EST
Description of problem:
When the host OS is booted without intel_iommu=on, a node device's driver is lost after nodedev-detach.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-16.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the host OS without the kernel option 'intel_iommu=on'.

2. Detach a node device (a network device) with the VFIO backend driver (vfio-pci):
# virsh nodedev-detach pci_0000_00_19_0 --driver vfio
Device pci_0000_00_19_0 detached

3. Reattach it:
# virsh nodedev-reattach pci_0000_00_19_0
error: Failed to re-attach device pci_0000_00_19_0
error: internal error: Invalid device 0000:00:19.0 driver file /sys/bus/pci/devices/0000:00:19.0/driver is not a symlink

Actual results:
The node device is detached even though the IOMMU is not enabled.

Expected results:
Since the IOMMU is disabled, step 2 should report an error such as:
error: Failed to detach device pci_0000_00_19_0
error: invalid argument: vfio device assignment is not currently supported on this system
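
The expected behavior amounts to a pre-flight check before the device is unbound. A minimal shell sketch of such a check, assuming that a populated /sys/kernel/iommu_groups directory indicates an active IOMMU. The function name has_iommu_groups and its optional directory argument are hypothetical and are not part of libvirt or virsh:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the IOMMU groups directory exists and
# contains at least one entry. The directory argument exists only so the
# check can be exercised against a test tree instead of the real sysfs.
has_iommu_groups() {
    dir="${1:-/sys/kernel/iommu_groups}"
    [ -d "$dir" ] || return 1
    # ls -A prints nothing for an empty directory.
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example use (commented out so that sourcing this file has no side effects):
# if has_iommu_groups; then
#     virsh nodedev-detach pci_0000_00_19_0 --driver vfio
# else
#     echo "IOMMU appears to be disabled; not detaching" >&2
# fi
```

This only approximates the check requested here; the actual fix is described by the upstream commits in comment 7.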
Comment 2 zhoujunqin 2013-12-30 05:40:04 EST
I ran into four issues while testing PCI detach on an ordinary (non-SR-IOV) machine. Please check whether these behaviors are expected. Thanks.

1. Detaching the PCI device fails on an ordinary machine when no driver is specified:
# lsmod|grep vfio
# lsmod|grep kvm
kvm_intel             138567  0
kvm                   424072  1 kvm_intel

# virsh nodedev-detach pci_0000_00_19_0
 error: Failed to detach device pci_0000_00_19_0
error: invalid argument: neither VFIO nor kvm device assignment is currently supported on this system

2. The PCI device is detached successfully when a driver (vfio or kvm) is specified, and no error is reported even if the same device is detached several times:

 # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
 Devices pci_0000_00_19_0 detache

 # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
 Devices pci_0000_00_19_0 detache

 # virsh nodedev-detach pci_0000_00_19_0  --driver kvm
 Devices pci_0000_00_19_0 detache

 #virsh nodedev-reattach pci_0000_00_19_0
 Devices pci_0000_00_19_0 re-attached

3. The PCI device is detached successfully without specifying a driver, once the commands from step 2 have been run:
# lsmod|grep vfio
vfio_iommu_type1       17636  0
vfio_pci               36474  0
vfio                   20777  2 vfio_iommu_type1,vfio_pci
# lsmod|grep kvm
kvm_intel             138567  0
kvm                   424072  1 kvm_intel

 # virsh nodedev-detach pci_0000_00_19_0  
 Devices pci_0000_00_19_0 detache

 #virsh nodedev-reattach pci_0000_00_19_0
 Devices pci_0000_00_19_0 re-attached

4. Whether or not intel_iommu=on is set, the PCI device is detached successfully when a driver is specified:
# lsmod|grep vfio
# lsmod|grep kvm
kvm_intel             138567  0
kvm                   424072  1 kvm_intel

# virsh nodedev-detach pci_0000_00_19_0
 error: Failed to detach device pci_0000_00_19_0
error: invalid argument: neither VFIO nor kvm device assignment is currently supported on this system

 # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
 Devices pci_0000_00_19_0 detache

 # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
 Devices pci_0000_00_19_0 detache

 # virsh nodedev-detach pci_0000_00_19_0  --driver kvm
 Devices pci_0000_00_19_0 detache
Comment 3 dyuan 2013-12-31 05:13:35 EST
(In reply to zhoujunqin from comment #2)
> I met 4 issues while i test the pci detach on a ordinary machine(not sriov)
> please help have a look that whether
> they work as expect or not, thanks
> 
> 1.Fail to detach the pci on a ordinary machine while i detach the pci
> without specify the pci driver
> # lsmod|grep vfio
> # lsmod|grep kvm
> kvm_intel             138567  0
> kvm                   424072  1 kvm_intel
> 
> # virsh nodedev-detach pci_0000_00_19_0
>  error: Failed to detach device pci_0000_00_19_0
> error: invalid argument: neither VFIO nor kvm device assignment is currently
> supported on this system

The error message is clear enough, since vfio is not loaded and the IOMMU is not enabled.

> 
> 2.The pci can be detached successfully while i specify the pci driver (vfio
> or kvm), and it didn't
> report error eventif i detach 1 pci for serval times
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
>  Devices pci_0000_00_19_0 detache
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
>  Devices pci_0000_00_19_0 detache
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver kvm
>  Devices pci_0000_00_19_0 detache
> 
>  #virsh nodedev-reattach pci_0000_00_19_0
>  Devices pci_0000_00_19_0 re-attached
> 
> 3.The pci can be detached successfully even if i didn't specify the driver
> after i have excuted the step2's command before
> # lsmod|grep vfio
> vfio_iommu_type1       17636  0
> vfio_pci               36474  0
> vfio                   20777  2 vfio_iommu_type1,vfio_pci
> # lsmod|grep kvm
> kvm_intel             138567  0
> kvm                   424072  1 kvm_intel
> 
>  # virsh nodedev-detach pci_0000_00_19_0  
>  Devices pci_0000_00_19_0 detache
> 
>  #virsh nodedev-reattach pci_0000_00_19_0
>  Devices pci_0000_00_19_0 re-attached
> 
> 4.No matter i enable the intel_iommu=on or not, the pci can be detached
> successfully while i detach the pci with the pci driver

This is the same as the bug description.
Is it also a duplicate of step 2 in your comment?

> # lsmod|grep vfio
> # lsmod|grep kvm
> kvm_intel             138567  0
> kvm                   424072  1 kvm_intel
> 
> # virsh nodedev-detach pci_0000_00_19_0
>  error: Failed to detach device pci_0000_00_19_0
> error: invalid argument: neither VFIO nor kvm device assignment is currently
> supported on this system
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
>  Devices pci_0000_00_19_0 detache
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver vfio
>  Devices pci_0000_00_19_0 detache
> 
>  # virsh nodedev-detach pci_0000_00_19_0  --driver kvm
>  Devices pci_0000_00_19_0 detache
Comment 4 Xuesong Zhang 2014-01-06 01:35:07 EST
Please ignore comment 2. Scenario 2 is the same as the bug description, and the results of the other scenarios are as expected.
Comment 5 Jincheng Miao 2014-01-16 04:38:13 EST
Hi Jiri,

I sent a fix to upstream; it works when the IOMMU is off. I hope it is helpful.
http://www.redhat.com/archives/libvir-list/2014-January/msg00708.html
Comment 6 Jiri Denemark 2014-01-17 05:42:23 EST
Patches sent upstream for review (a polished version of the patch from comment 5 is included in the series): https://www.redhat.com/archives/libvir-list/2014-January/msg00784.html
Comment 7 Jiri Denemark 2014-01-20 08:34:10 EST
This is now fixed upstream by commits v1.2.1-24-gc982e5e..v1.2.1-32-gb70c093:

commit c982e5e84f47d3e71d794bb034ef2489091f41c6
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Wed Jan 15 11:44:53 2014 +0100

    pci: Make reattach work for unbound devices
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1046919
    
    When a PCI device is not bound to any driver, reattach should just
    trigger driver probe rather than failing with
    
        Invalid device 0000:00:19.0 driver file
        /sys/bus/pci/devices/0000:00:19.0/driver is not a symlink
    
    While virPCIDeviceGetDriverPathAndName was documented to return success
    and NULL driver and path when a device is not attached to any driver but
    didn't do so. Thus callers could not distinguish unbound devices from
    failures.
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
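
The distinction this commit draws (bound, unbound, or a real failure) can be illustrated from the shell. A minimal sketch, assuming the standard sysfs layout in which /sys/bus/pci/devices/<address>/driver is a symlink to the bound driver. The function name pci_driver_name and its sysfs-root argument are hypothetical and do not mirror libvirt's C implementation:

```shell
#!/bin/sh
# Hypothetical helper: print the name of the driver bound to a PCI device.
# Prints nothing and still succeeds when the device exists but is unbound.
# Fails when the device is missing or the driver entry is not a symlink.
# $1 = PCI address such as 0000:00:19.0
# $2 = sysfs root (defaults to /sys; overridable for testing)
pci_driver_name() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    [ -d "$dev" ] || return 1
    if [ -L "$dev/driver" ]; then
        # The link target ends in the driver name, e.g. .../drivers/e1000e
        basename "$(readlink "$dev/driver")"
    elif [ -e "$dev/driver" ]; then
        echo "unexpected non-symlink: $dev/driver" >&2
        return 1
    fi
    return 0
}
```

With a result like this, an empty driver name can be treated as "unbound, trigger a probe" (for example by writing the address to /sys/bus/pci/drivers_probe, the file mentioned later in this series) rather than as an error.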

commit d8ab981bdd137f15675ee0d101abeabf42680cc1
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Thu Jan 16 20:08:00 2014 +0100

    pci: Fix failure paths in detach
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1046919
    
    Since commit v0.9.0-47-g4e8969e (released in 0.9.1) some failures during
    device detach were reported to callers of virPCIDeviceBindToStub as
    success. For example, even though a device seemed to be detached
    
        virsh # nodedev-detach pci_0000_07_05_0 --driver vfio
        Device pci_0000_07_05_0 detached
    
    one could find similar message in libvirt logs:
    
        Failed to bind PCI device '0000:07:05.0' to vfio-pci: No such device
    
    This patch fixes these paths and also avoids overwriting real errors
    with errors encountered during a cleanup phase.

commit df8022721ef09b2e0bd06e16c7d45ff99034f761
Author: Jincheng Miao <jmiao@redhat.com>
Date:   Thu Jan 16 16:59:50 2014 +0800

    qemu: Don't detach devices if passthrough doesn't work
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1046919
    
    If none (KVM, VFIO) of the supported PCI passthrough methods is known to
    work on a host, it's better to fail right away with a nice error message
    rather than letting attachment fail with a more cryptic message such as
    
        Failed to bind PCI device '0000:07:05.0' to vfio-pci: No such device
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 44bfe3574a612e9cab288fa57522c854c701540b
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Tue Jan 14 14:59:37 2014 +0100

    virpcitest: Show PCI device tested by each test
    
    For example:
    
     ...
     5) testVirPCIDeviceIsAssignable(0005:90:01.0)      ... OK
     6) testVirPCIDeviceIsAssignable(0001:01:00.0)      ... OK
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 124affae84539ac3f31a366a96f427df86674d33
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Thu Jan 16 12:27:23 2014 +0100

    pci: Publish some internal code for virpcitest
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 508b566ec24ce3df6ec9c4a31bb7512bc0177097
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Thu Jan 16 12:28:12 2014 +0100

    virpcimock: Mock /sys/bus/pci/drivers_probe
    
    This file is used by PCI detach and reattach APIs to probe for a driver
    that handles a specific device.
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit bbeadb820c14514bd50392e79844b8fc53fb024e
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Wed Jan 15 10:20:55 2014 +0100

    virpcitest: More tests for device detach and reattach
    
    Especially for devices that are not bound to any driver.
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit b803b29c1a57856a6ab4d2c6ae268c093826a2de
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Thu Jan 16 14:05:19 2014 +0100

    virpcimock: Add PCI driver which always fails
    
    Such driver can be used to make sure PCI APIs fail properly.
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit b70c093ffa00cd87c8d39d3652b798f033a81faf
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Thu Jan 16 14:06:22 2014 +0100

    virpcitest: Test virPCIDeviceDetach failure
    
    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Comment 10 Hu Jianwei 2014-01-24 01:27:13 EST
I can reproduce it on libvirt-1.1.1-18.el7.x86_64, but cannot reproduce it with the versions below:
libvirt-1.1.1-19.el7.x86_64
qemu-kvm-1.5.3-41.el7.x86_64
kernel-3.10.0-78.el7.x86_64

1. PCI devices are not detached when iommu=off.
[root@sriov2 ~]# grep -irn "iommu" /var/log/messages
19211:Jan 24 11:55:48 sriov2 kernel: [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-78.el7.x86_64 root=UUID=47dce99f-353c-45d2-9b91-88e1b9e54b71 ro vconsole.font=latarcyrheb-sun16 crashkernel=auto vconsole.keymap=us rhgb quiet intel_iommu=off
19212:Jan 24 11:55:48 sriov2 kernel: [    0.000000] Intel-IOMMU: disabled
...(clipped)

[root@sriov2 ~]# virsh nodedev-detach pci_0000_01_00_0 --driver vfio
error: Failed to detach device pci_0000_01_00_0
error: argument unsupported: VFIO device assignment is currently not supported on this system

[root@sriov2 ~]# virsh nodedev-detach pci_0000_01_00_0 --driver kvm
error: Failed to detach device pci_0000_01_00_0
error: argument unsupported: KVM device assignment is currently not supported on this system

[root@sriov2 ~]# virsh nodedev-detach pci_0000_01_00_0 
error: Failed to detach device pci_0000_01_00_0
error: Operation not supported: neither VFIO nor KVM device assignment is currently supported on this system

2. With iommu either on or off, a PCI device that has no driver can be reattached to its kernel driver.
[root@sriov2 ~]# grep -irn "iommu" /var/log/messages
19211:Jan 24 11:55:48 sriov2 kernel: [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-78.el7.x86_64 root=UUID=47dce99f-353c-45d2-9b91-88e1b9e54b71 ro vconsole.font=latarcyrheb-sun16 crashkernel=auto vconsole.keymap=us rhgb quiet intel_iommu=off
19212:Jan 24 11:55:48 sriov2 kernel: [    0.000000] Intel-IOMMU: disabled
...(clipped) 
[root@sriov2 ~]# echo 0000:01:00.0 > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind
[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0 
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>1</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x10d3'>82574L Gigabit Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
  </capability>
</device>


[root@sriov2 ~]# virsh nodedev-reattach pci_0000_01_00_0 
Device pci_0000_01_00_0 re-attached

[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0 
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <driver>
    <name>e1000e</name>
  </driver>
...(clipped)

[root@sriov2 ~]# grep -irn "iommu" /var/log/messages
21184:Jan 24 12:50:38 sriov2 kernel: [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-78.el7.x86_64 root=UUID=47dce99f-353c-45d2-9b91-88e1b9e54b71 ro vconsole.font=latarcyrheb-sun16 crashkernel=auto vconsole.keymap=us rhgb quiet intel_iommu=on
21185:Jan 24 12:50:38 sriov2 kernel: [    0.000000] Intel-IOMMU: enabled
...(clipped)

[root@sriov2 ~]# echo 0000:01:00.0 > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind
[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <capability type='pci'>
    <domain>0</domain>
    <bus>1</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x10d3'>82574L Gigabit Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <iommuGroup number='11'>
...(clipped)

[root@sriov2 ~]# virsh nodedev-reattach pci_0000_01_00_0
Device pci_0000_01_00_0 re-attached

[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <driver>
    <name>e1000e</name>
  </driver>
...(clipped) 

3. With iommu=on, the latest libvirt cannot detach a device to the kvm driver (pci-stub) on the latest kernel (kernel-3.10.0-78.el7.x86_64).
[root@sriov2 ~]# grep -irn "iommu" /var/log/messages
21184:Jan 24 12:50:38 sriov2 kernel: [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-78.el7.x86_64 root=UUID=47dce99f-353c-45d2-9b91-88e1b9e54b71 ro vconsole.font=latarcyrheb-sun16 crashkernel=auto vconsole.keymap=us rhgb quiet intel_iommu=on
21185:Jan 24 12:50:38 sriov2 kernel: [    0.000000] Intel-IOMMU: enabled
...(clipped)
[root@sriov2 ~]# virsh nodedev-detach  pci_0000_01_00_0 --driver kvm
error: Failed to detach device pci_0000_01_00_0
error: argument unsupported: KVM device assignment is currently not supported on this system

4. With an old kernel (kernel-3.10.0-9.el7.x86_64) and iommu=on, the latest libvirt can detach a device to the kvm driver (pci-stub).
[root@sriov2 ~]# uname -r
3.10.0-9.el7.x86_64
[root@sriov2 ~]# modprobe -r kvm_intel
[root@sriov2 ~]# modprobe -r kvm
[root@sriov2 ~]# modprobe kvm allow_unsafe_assigned_interrupts=1
[root@sriov2 ~]# modprobe kvm_inte

[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <driver>
    <name>e1000e</name>
  </driver>
...

[root@sriov2 ~]# virsh nodedev-detach  pci_0000_01_00_0 --driver kvm
Device pci_0000_01_00_0 detached

[root@sriov2 ~]# virsh nodedev-dumpxml pci_0000_01_00_0
<device>
  <name>pci_0000_01_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0</path>
  <parent>pci_0000_00_1c_0</parent>
  <driver>
    <name>pci-stub</name>
  </driver>
  <capability type='pci'>
...(clipped)

The results are as expected; changing the status to VERIFIED.
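
The boot-option checks in the steps above were done by grepping /var/log/messages. A minimal sketch of an equivalent check against the kernel command line, assuming /proc/cmdline is readable; the function name intel_iommu_state is hypothetical. Note that the option alone does not prove the IOMMU is active, since hardware and firmware support are also required:

```shell
#!/bin/sh
# Hypothetical helper: report the intel_iommu setting found in a kernel
# command line. Prints "on", "off", or "unset".
# $1 = command line text (defaults to the contents of /proc/cmdline)
intel_iommu_state() {
    cmdline="${1-$(cat /proc/cmdline)}"
    state=unset
    # Split the command line on whitespace and inspect each option;
    # if the option appears more than once, the last occurrence wins here.
    for arg in $cmdline; do
        case "$arg" in
            intel_iommu=on*)  state=on ;;
            intel_iommu=off*) state=off ;;
        esac
    done
    echo "$state"
}
```

For example, `intel_iommu_state` with no argument reports the setting of the running kernel, while passing a string checks an arbitrary command line.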
Comment 11 Ludek Smid 2014-06-13 09:11:30 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
