Bug 1025700

Summary: qemu-kvm hangs when hot-unplugging a VF after releasing the VF in the host.
Product: Red Hat Enterprise Linux 7 Reporter: Xu Han <xuhan>
Component: qemu-kvm Assignee: Bandan Das <bdas>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 7.0 CC: acathrow, alex.williamson, chayang, hhuang, juzhang, michen, mrezanin, sluo, virt-maint, xfu, xuhan
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 12:42:44 UTC Type: Bug

Description Xu Han 2013-11-01 10:44:20 UTC
Description of problem:
qemu-kvm hangs when hot-unplugging a VF after the VF has been released in the host.
# lspci | grep 82599
05:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

Version-Release number of selected component (if applicable):
kernel-3.10.0-40.el7.x86_64
qemu-kvm-rhev-1.5.3-10.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. boot guest with VF (vfio).
# /usr/libexec/qemu-kvm -nodefaults -M pc -m 2G -cpu Nehalem -smp 4,cores=2,threads=2,sockets=1 -boot menu=on -monitor stdio -vga qxl -spice disable-ticketing,port=5931 -drive file=/home/vfio-RHEL7.0-64.qcow2_v3,id=guest-img,if=none,cache=none,aio=native -device virtio-blk-pci,scsi=off,drive=guest-img,id=os-disk,bootindex=1 -device virtio-balloon-pci,id=balloon -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -qmp tcp:0:5555,server,nowait -serial unix:/tmp/guest-sock,server,nowait \
-device vfio-pci,host=05:10.0,id=vf0

2. release VF in host.
Scenario 1, unbind its parent PF.
# echo "8086 10fb" > /sys/bus/pci/drivers/vfio-pci/new_id
# echo "0000:05:00.0" > /sys/bus/pci/devices/0000:05:00.0/driver/unbind
# echo "0000:05:00.0" > /sys/bus/pci/drivers/vfio-pci/bind

Scenario 2, change the number of VFs through sysfs.
# echo 0 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs

3. hot unplug VF.
(qemu) device_del vf0
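
Before releasing VFs as in step 2, it can help to see which VFs of the PF are still bound to vfio-pci, since those are the ones a running guest may be holding. The following is a minimal sketch, assuming the usual sysfs layout (`virtfnN` links under the PF, a `driver` link under each VF). The function name `vfs_bound_to_vfio` and its optional sysfs-root argument are illustrative only and not part of any existing tool.

```shell
#!/bin/sh
# List the VFs of a PF that are currently bound to vfio-pci.
# Usage: vfs_bound_to_vfio <pf-address> [sysfs-root]
# The sysfs-root argument (default /sys) is a hypothetical hook so the
# function can be tried against a fake directory tree.
vfs_bound_to_vfio() {
    pf=$1
    root=${2:-/sys}
    for link in "$root/bus/pci/devices/$pf"/virtfn*; do
        # If the PF has no VFs the glob stays literal and is not a symlink.
        [ -L "$link" ] || continue
        # virtfnN points at ../<vf-address>; keep only the address part.
        vf=$(basename "$(readlink "$link")")
        drv_link="$root/bus/pci/devices/$vf/driver"
        # A VF without a bound driver has no driver link.
        [ -L "$drv_link" ] || continue
        drv=$(basename "$(readlink "$drv_link")")
        if [ "$drv" = "vfio-pci" ]; then
            echo "$vf"
        fi
    done
    return 0
}
```

For the setup in this report, `vfs_bound_to_vfio 0000:05:00.0` would be run on the host before writing to `sriov_numvfs`; any address it prints is a VF that should be hot-unplugged from its guest first.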


Actual results:
After step 2, the VF cannot be released.
After step 3, the hot unplug succeeds and qemu-kvm hangs.
# lspci | grep 82599
05:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
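
Comparing this listing with the one at the top of the description shows that only 05:10.0 has disappeared. A small sketch of that comparison follows, assuming VFs are identifiable by the string "Virtual Function" in `lspci` output, as they are for the 82599 here (other NICs may label VFs differently). The helper names `vf_addresses` and `removed_vfs` are made up for illustration.

```shell
#!/bin/sh
# Print the PCI addresses of all VFs found in `lspci` output on stdin.
vf_addresses() {
    awk '/Virtual Function/ { print $1 }'
}

# Print every address that is in the first list but not in the second.
# Both arguments are whitespace-separated address lists.
removed_vfs() {
    before=$1
    after=$2
    for addr in $before; do
        # $(echo $after) flattens newlines so the match works on one line.
        case " $(echo $after) " in
            *" $addr "*) ;;
            *) echo "$addr" ;;
        esac
    done
}
```

Typical use would be to capture `lspci | vf_addresses` before and after the release and pass both results to `removed_vfs`.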


Expected results:
qemu-kvm should not hang.

Additional info:
After step 2, the guest could still be shut down successfully, which released the VF.

Comment 2 Bandan Das 2014-01-30 16:05:23 UTC
Please confirm how you create the VFs: by passing "max_vfs" to modprobe or by using the sysfs interface. Also please confirm whether you can reproduce the problem with both of these methods of creating VFs.

Comment 5 Xu Han 2014-02-12 05:09:50 UTC
Tested this issue with kernel-debug-3.10.0-86.el7.x86_64

Scenario 1, create VFs via modprobe
1. create VFs
# modprobe ixgbe max_vfs=2

2. check VFs on host
# lspci | grep "Virtual Function"
05:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

3. bind one VF to vfio-pci
# echo "0000:05:10.0" > /sys/bus/pci/devices/0000:05:10.0/driver/unbind
# echo "8086 10ed" > /sys/bus/pci/drivers/vfio-pci/new_id

4. boot guest with assigned VF
# /usr/libexec/qemu-kvm ...\
  -device vfio-pci,host=0000:05:10.0,id=vf0

5. release VFs of PF '05:00.0' via sysfs interface on host
# echo 0 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs
-- This process hangs.

6. check VFs on host
# lspci | grep "Virtual Function"
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

7. reboot guest
-- While the guest is booting, the messages below appear.
qemu-kvm: vfio: hot reset info failed: Operation not permitted
qemu-kvm: vfio: hot reset info failed: Operation not permitted
[ 1455.284560] vfio_pci_disable: Failed to reset device 0000:05:10.0 (-11)

8. hot unplug VF
(qmp) { 'execute' : 'device_del', 'arguments' : { 'id' : 'vf0' } }
{"return": {}}
{"timestamp": {"seconds": 1392176134, "microseconds": 399986}, "event": "DEVICE_DELETED", "data": {"device": "vf0", "path": "/machine/peripheral/vf0"}}
-- After the VF was hot-unplugged, the release process from step 5 returned, and qemu-kvm did not hang.

9. check VFs
# lspci | grep "Virtual Function"
05:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
-- The VFs of PF '05:00.0' were released successfully.

10. check dmesg on host
# dmesg
...
[ 1201.604169] INFO: task bash:4253 blocked for more than 120 seconds.
[ 1201.610541] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1201.618459] bash            D 0000000000000000  5232  4253   4249 0x00000080
[ 1201.618466]  ffff8802222e7c10 0000000000000046 00000000001d5540 ffff8802222e7fd8
[ 1201.618472]  ffff8802222e7fd8 00000000001d5540 ffff880226c08000 ffff880221c0ce40
[ 1201.618477]  ffff880221d1ec00 ffff880222616098 ffff880221d1f000 ffff880226383a00
[ 1201.618487] Call Trace:
[ 1201.618497]  [<ffffffff81683159>] schedule+0x29/0x70
[ 1201.618505]  [<ffffffffa05a5312>] vfio_del_group_dev+0xc2/0x150 [vfio]
[ 1201.618510]  [<ffffffff81098270>] ? wake_up_bit+0x30/0x30
[ 1201.618515]  [<ffffffffa05ac12b>] vfio_pci_remove+0x1b/0x40 [vfio_pci]
[ 1201.618521]  [<ffffffff81345efb>] pci_device_remove+0x3b/0xb0
[ 1201.618528]  [<ffffffff8140c68f>] __device_release_driver+0x7f/0xf0
[ 1201.618534]  [<ffffffff8140c725>] device_release_driver+0x25/0x40
[ 1201.618538]  [<ffffffff8140be8c>] bus_remove_device+0x11c/0x1a0
[ 1201.618541]  [<ffffffff814086c2>] device_del+0x142/0x1e0
[ 1201.618547]  [<ffffffff8133f414>] pci_stop_bus_device+0x94/0xa0
[ 1201.618551]  [<ffffffff8133f502>] pci_stop_and_remove_bus_device+0x12/0x20
[ 1201.618560]  [<ffffffff8135f27f>] virtfn_remove+0xef/0x180
[ 1201.618566]  [<ffffffff8135fa44>] pci_disable_sriov+0x74/0x140
[ 1201.618577]  [<ffffffffa0579ef3>] ixgbe_disable_sriov+0x83/0x160 [ixgbe]
[ 1201.618586]  [<ffffffffa057a2d7>] ixgbe_pci_sriov_configure+0x77/0x180 [ixgbe]
[ 1201.618594]  [<ffffffff813478c7>] sriov_numvfs_store+0xc7/0x130
[ 1201.618598]  [<ffffffff814076b8>] dev_attr_store+0x18/0x30
[ 1201.618604]  [<ffffffff8126df0b>] sysfs_write_file+0xdb/0x150
[ 1201.618608]  [<ffffffff811ee090>] vfs_write+0xc0/0x1f0
[ 1201.618613]  [<ffffffff8120e607>] ? fget_light+0x3a7/0x510
[ 1201.618616]  [<ffffffff811eea9c>] SyS_write+0x4c/0xa0
[ 1201.618620]  [<ffffffff8168f659>] system_call_fastpath+0x16/0x1b
[ 1201.618624] 5 locks held by bash/4253:
[ 1201.618627]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff811ee18b>] vfs_write+0x1bb/0x1f0
[ 1201.618634]  #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8126de6c>] sysfs_write_file+0x3c/0x150
[ 1201.618643]  #2:  (s_active#225){.+.+.+}, at: [<ffffffff8126def3>] sysfs_write_file+0xc3/0x150
[ 1201.618654]  #3:  (&iov->lock){+.+.+.}, at: [<ffffffff8135f277>] virtfn_remove+0xe7/0x180
[ 1201.618661]  #4:  (&__lockdep_no_validate__){......}, at: [<ffffffff8140c71d>] device_release_driver+0x1d/0x40
...
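
The hung-task report above names the blocked process in its first line. A minimal sketch for pulling those names out of a long dmesg log, assuming the "INFO: task NAME:PID blocked for more than" wording shown in this trace; the function name `hung_tasks` is illustrative only.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "name pid" for every task that
# the hung-task detector reported. Relies on the message wording seen in
# the trace above; other kernel versions may word it differently.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than.*/\1 \2/p'
}
```

On the host it would be used as `dmesg | hung_tasks`.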

Scenario 2, create VFs via sysfs interface
1. create VFs
# echo 2 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs

Steps 2 to 10 were the same as in Scenario 1.

Test results were the same as in Scenario 1.

--
Unloading the ixgbe module or binding the parent PF to vfio-pci while a VM is running with an assigned VF does not hit this issue in the current kernel either. After the operation returned, the VFs were not released, and qemu-kvm did not hang.

Comment 6 Chao Yang 2014-02-12 08:28:39 UTC
Additional info:

While trying to verify Bug 1045175, I hit the same log message, "qemu-kvm: vfio: hot reset info failed: Operation not permitted", with an Intel dual-port 82576 SR-IOV NIC, but there was no call trace in dmesg.

Comment 7 Bandan Das 2014-02-12 08:56:22 UTC
(In reply to xuhan from comment #5)

> --
> Unloading ixgbe module or binding parent PF to vfio-pci while VM running
> with assigned VF would not hit this issue as well in current kernel. After
> previous operating returned, VFs would not released, and qemu-kvm not hung.

From your comment above, can you clarify whether, apart from the messages:
qemu-kvm: vfio: hot reset info failed: Operation not permitted
qemu-kvm: vfio: hot reset info failed: Operation not permitted

you see any other issue with the new kernel? It looks like you no longer see the qemu-kvm hang, so the problem is fixed, right? Or am I missing something?

Comment 8 Xu Han 2014-02-12 10:08:34 UTC
This issue itself is fixed by the new kernel. I am just not sure whether some of the operations between steps 5 and 8 could lead to other issues.

Comment 9 Bandan Das 2014-02-12 10:22:36 UTC
(In reply to xuhan from comment #8)
> This issue itself is fixed by the new kernel. I just not sure whether exist
> some operations between step5 and 8 would lead to other issue.

Between step 5 and 8, I think the only potential issue is step 7 -
7. reboot guest
-- While guest booting, these messages below would be noticed.
qemu-kvm: vfio: hot reset info failed: Operation not permitted
qemu-kvm: vfio: hot reset info failed: Operation not permitted
[ 1455.284560] vfio_pci_disable: Failed to reset device 0000:05:10.0 (-11)

This needs to be investigated but I would prefer if we have a new bug for it. 

The hang in step 5 is expected, since the guest hasn't relinquished control of the VF yet (just like an rmmod when the driver is in use).
Thanks.
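
Given that explanation, the host-side release only completes once the guest gives the VF up, so one way to avoid the wait is to unplug first and release afterwards. The sketch below only builds the QMP input and checks for the completion event. It assumes the event line format shown in comment 5 and a device id without quotes or backslashes. The function names are hypothetical, and connecting to the QMP socket is left out because it depends on the setup.

```shell
#!/bin/sh
# Print the QMP commands needed to hot-unplug a device by id.
# qmp_capabilities has to be sent before any other QMP command.
qmp_unplug_script() {
    id=$1
    printf '{"execute": "qmp_capabilities"}\n'
    printf '{"execute": "device_del", "arguments": {"id": "%s"}}\n' "$id"
}

# Succeed if QMP output on stdin contains a DEVICE_DELETED event for the
# given id. The pattern follows the key order seen in comment 5; a QEMU
# build that orders the keys differently would need a different match.
saw_device_deleted() {
    grep -q "\"event\": \"DEVICE_DELETED\".*\"device\": \"$1\""
}
```

Once `saw_device_deleted vf0` succeeds on the QMP output, writing 0 to `sriov_numvfs` or unbinding the PF should no longer have to wait for the guest.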

Comment 10 Xu Han 2014-02-12 10:40:03 UTC
(In reply to Bandan Das from comment #9)
> Between step 5 and 8, I think the only potential issue is step 7 -
> 7. reboot guest
> -- While guest booting, these messages below would be noticed.
> qemu-kvm: vfio: hot reset info failed: Operation not permitted
> qemu-kvm: vfio: hot reset info failed: Operation not permitted
> [ 1455.284560] vfio_pci_disable: Failed to reset device 0000:05:10.0 (-11)
> 
> This needs to be investigated but I would prefer if we have a new bug for
> it. 
> 
> The hang in step 5 is expected since the guest hasn't relinquished control
> yet (just like a rmmod when the driver is in use)
> Thanks.

Agreed. Thanks.

Comment 11 Bandan Das 2014-02-19 17:04:49 UTC
Closing this since QE verified the fix in recent kernels.

Comment 12 Bandan Das 2014-02-20 19:30:06 UTC
Reopening. I was just informed that QE needs to close this.

Comment 14 Xu Han 2014-03-13 10:50:53 UTC
Verified this bug with component:
kernel-3.10.0-109.el7.x86_64

Steps:
1. Check VF.
# lspci | grep Emulex
07:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
07:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
07:04.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01) <- VF
07:04.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01) <- VF

2. Boot guest with assigned VF.
# /usr/libexec/qemu-kvm -M pc-i440fx-rhel7.0.0 -cpu SandyBridge -m 4G -S -smp 4,threads=1,cores=4,sockets=1 -enable-kvm -name RHEL-Server-7.0-64 -uuid cca1433d-5bac-490f-a097-c5c80c1a083f -nodefconfig -nodefaults -k en-us -rtc base=utc,clock=host,driftfix=slew -qmp tcp:0:5000,server,nowait -boot order=c,menu=on -vga qxl -global qxl-vga.vram_size=67108864 -spice port=6000,disable-ticketing -device virtio-scsi-pci,id=scsi0 -drive file=/var/lib/libvirt/images/r7.img,if=none,id=drive-scsi0-0-0,cache=none,aio=native,rerror=stop,werror=stop -device scsi-hd,drive=drive-scsi0-0-0,id=os-disk,bus=scsi0.0,bootindex=1 -netdev tap,id=tap0,vhost=on,script=/etc/qemu-ifup,queues=2 -device virtio-net-pci,netdev=tap0,mac=54:d3:89:1c:a0:7d,id=net0,vectors=6,mq=on \
-device vfio-pci,host=07:04.0,id=vf0 \
-monitor stdio

3. Unbind its parent PF.
# echo "0000:07:00.0" > /sys/bus/pci/devices/0000\:07\:00.0/driver/unbind

4. Hot unplug VF via QMP.
{ 'execute' : 'device_del', 'arguments' : { 'id' : 'vf0' } }

Results:
After step 3, the unbind process hangs.
After step 4, the VF is hot-unplugged successfully, and the process from step 3 returns.
# lspci | grep Emulex
07:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
07:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)


Based on the test results above, this bug has been fixed.

Comment 16 Ludek Smid 2014-06-13 12:42:44 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.