Bug 986861 - guest hung in boot process with hot plug/unplug vf many times
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Alex Williamson
QA Contact: Virtualization Bugs
Docs Contact:
Depends On:
Blocks:
Reported: 2013-07-22 05:27 EDT by mazhang
Modified: 2016-09-20 00:39 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-01-20 11:52:13 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description mazhang 2013-07-22 05:27:34 EDT
Description of problem:
Boot up a guest without a VF, hot plug/unplug the VF by script, and set the guest to reboot at a one-minute interval. After two days of testing, the guest hung in the boot process.

Version-Release number of selected component (if applicable):

host configuration:
qemu-kvm-0.12.1.2-2.378.el6.x86_64
2.6.32-398.el6.x86_64

guest:
RHEL6.5-20130712.n.0
2.6.32-398.el6.x86_64


How reproducible:
once

Steps to Reproduce:
1. Unbind the VF (a physical NIC device) from its host driver and bind it to pci-stub
# echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id
# echo 0000:06:10.0 > /sys/bus/pci/devices/0000\:06\:10.0/driver/unbind
# echo 0000:06:10.0 > /sys/bus/pci/drivers/pci-stub/bind

2. Boot up the guest
#/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-m 2G \
-smp 2,sockets=1,cores=2,threads=1,maxcpus=16 \
-enable-kvm \
-name rhel6u5 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-monitor stdio \
-qmp tcp:0:6666,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-vga qxl \
-spice port=5900,disable-ticketing \
-global PIIX4_PM.disable_s3=0 \
-global PIIX4_PM.disable_s4=0 \
-monitor unix:/tmp/monitor-unix,nowait,server \
-drive file=/home/rhel6u5.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0

3. Set up a one-minute reboot interval with cron
#crontab -l
*/1 * * * * reboot

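4. Hot plug/unplug the VF by script
#!/bin/bash

# Repeatedly hot plug and hot unplug the VF through the QEMU monitor socket,
# pausing 2 seconds before each command.
while true
do
    sleep 2
    echo "device_add pci-assign,host=06:10.0,id=hostnet" | nc -U /tmp/monitor-unix
    sleep 2
    echo "device_del hostnet" | nc -U /tmp/monitor-unix
done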



Actual results:
After about 2 days, the guest hung in the boot process.
1. The QEMU monitor still reports the VM as "running":
(qemu) info status 
VM status: running

2. QEMU monitor log:
(qemu) assigned_dev_enable_msix: assign irq: Invalid argument
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
assigned_dev_enable_msix: assign irq: Device or resource busy
assigned_dev_enable_msix: assign irq: Device or resource busy
assigned_dev_enable_msix: assign irq: Invalid argument
assigned_dev_enable_msix: assign irq: Invalid argument
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
assigned_dev_enable_msix: assign irq: Invalid argument
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
assigned_dev_enable_msix: assign irq: Device or resource busy
assigned_dev_enable_msix: assign irq: Device or resource busy
assigned_dev_enable_msix: assign irq: Invalid argument
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
assigned_dev_enable_msix: assign irq: Invalid argument
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
assigned_dev_enable_msix: assign irq: Invalid argument

3. The guest did not generate a dump file.

Expected results:
The guest works well.

Additional info:
This happened only once; I will keep trying to reproduce the problem.
Comment 2 RHEL Product and Program Management 2013-10-13 23:04:43 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 3 Alex Williamson 2014-07-22 17:28:54 EDT
Why is this a valid test?  Hotplug is an interactive process between QEMU and guest.  The indicated test scripts do nothing to ensure that the guest has acknowledged and completed the previous step before moving to the next step.  It is guaranteed that the guest and test script will be out of sync during various points.  The only surprising thing here IMHO is that it took 2 days to fail.
Comment 4 mazhang 2014-07-23 04:52:26 EDT
I'm trying to test this scenario using autotest on the latest kernel and qemu-kvm, and will update with the result after the test finishes.

Thanks,
Mazhang.
Comment 5 mazhang 2014-07-23 22:13:39 EDT
The test case should be: repeat hotplug/unplug of the VF 500 times, without rebooting the guest.
I also tested hotplug/unplug 500 times on the latest kernel and qemu-kvm; the guest and qemu-kvm work well.

This appears to be an invalid case, so please feel free to close this bug as "NOTABUG".

Thanks,
Mazhang.
Comment 7 Alex Williamson 2015-01-20 11:52:13 EST
The problem mentioned in comment 6 has been root caused to be a VF issue specific to that particular PF driver, so not related to this test case.  As in comment 3, I believe this to be an invalid test case.  If we want to do such combined stress testing as doing asynchronous hotplug at the same time as reboot testing, the scripts should at least test whether the previous step was acknowledged and completed by the guest.  I'd also suggest that RHEL7 is a better target for that kind of advanced hardening testing at this point.
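
For illustration only, a hot plug/unplug loop that waits for the guest before issuing the next monitor command might look roughly like the sketch below. It reuses the monitor socket, VF address, device id, and PCI ID (8086:10ed) from the description. Everything else is an assumption and not part of the original test: the GUEST_SSH login, checking for the VF with lspci inside the guest over ssh, the one-second polling, and the 60-second timeout. Seeing the VF in lspci is only a rough proxy for "acknowledged and completed", and it does not address the combined reboot case.

```shell
#!/bin/bash
# Sketch only: hot plug/unplug that waits for the guest between steps.
# GUEST_SSH, the lspci check, and STEP_TIMEOUT are assumptions.

MONITOR_SOCK="${MONITOR_SOCK:-/tmp/monitor-unix}"  # monitor socket from step 2
VF_HOST_ADDR="${VF_HOST_ADDR:-06:10.0}"            # VF address from step 1
DEV_ID="${DEV_ID:-hostnet}"                        # device id from step 4
GUEST_SSH="${GUEST_SSH:-root@guest}"               # hypothetical guest login
STEP_TIMEOUT="${STEP_TIMEOUT:-60}"                 # seconds to wait per step

# Send one command to the QEMU human monitor.
monitor_cmd() {
    echo "$1" | nc -U "$MONITOR_SOCK"
}

# Succeeds only if the guest is reachable and lists the VF (PCI ID 8086:10ed).
guest_sees_vf() {
    ssh -o ConnectTimeout=5 "$GUEST_SSH" 'lspci -n | grep -q "8086:10ed"'
}

# Succeeds only if the guest is reachable and no longer lists the VF.
# The negation runs inside the guest, so an ssh failure still counts as failure.
guest_lacks_vf() {
    ssh -o ConnectTimeout=5 "$GUEST_SSH" '! lspci -n | grep -q "8086:10ed"'
}

# Poll a command about once per second until it succeeds or the timeout expires.
# Usage: wait_until SECONDS COMMAND [ARGS...]
wait_until() {
    local timeout_s elapsed
    timeout_s=$1
    shift
    elapsed=0
    until "$@"; do
        elapsed=$((elapsed + 1))
        [ "$elapsed" -ge "$timeout_s" ] && return 1
        sleep 1
    done
    return 0
}

# One plug/unplug cycle; returns non-zero if the guest never catches up.
hotplug_cycle() {
    monitor_cmd "device_add pci-assign,host=$VF_HOST_ADDR,id=$DEV_ID"
    wait_until "$STEP_TIMEOUT" guest_sees_vf || return 1
    monitor_cmd "device_del $DEV_ID"
    wait_until "$STEP_TIMEOUT" guest_lacks_vf || return 1
}
```

With these functions loaded, the 500-iteration run from comment 5 could be driven with something like `for i in $(seq 1 500); do hotplug_cycle || break; done`, which stops at the first step the guest fails to complete instead of continuing out of sync.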
