Bug 520572
Summary: | SR-IOV -- Guest exit and host hang on if boot VM with 8 VFs assigned | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Yolkfull Chow <yzhou> |
Component: | kvm | Assignee: | Don Dutile (Red Hat) <ddutile> |
Status: | CLOSED ERRATA | QA Contact: | Lawrence Lim <llim> |
Severity: | medium | Docs Contact: | |
Priority: | high | | |
Version: | 5.4 | CC: | cpelland, ehabkost, juzhang, ndai, qzhang, tburke, tools-bugs, virt-maint, ykaul |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | kvm-83-165.el5 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2011-01-13 23:11:46 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 579862, 579863 | | |
Description
Yolkfull Chow
2009-09-01 07:44:06 UTC
Sometimes a Tx Unit Hang shows up in the host dmesg when booting a guest with 8 VFs:

    ...
    kvm: exhaust allocatable IRQ sources!
    kvm: exhaust allocatable IRQ sources!
    NETDEV WATCHDOG: eth0: transmit timed out
    igb 0000:28:00.0: Detected Tx Unit Hang
      Tx Queue <0>
      TDH <a3>
      TDT <8d>
      next_to_use <8d>
      next_to_clean <a3>
    buffer_info[next_to_clean]
      time_stamp <10029ad8b>
      next_to_watch <a3>
      jiffies <10029e06b>
      desc.status <a8000>
    breth0: port 1(eth0) entering disabled state
    igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
    breth0: topology change detected, propagating
    breth0: port 1(eth0) entering forwarding state
    NETDEV WATCHDOG: eth0: transmit timed out
    igb 0000:28:00.0: Detected Tx Unit Hang
      Tx Queue <0>
      TDH <d7>
      TDT <c1>
      next_to_use <c1>
      next_to_clean <d7>
    buffer_info[next_to_clean]
      time_stamp <1002a263f>
      next_to_watch <d7>
      jiffies <1002a4e33>
      desc.status <a8000>
    breth0: port 1(eth0) entering disabled state
    igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
    breth0: topology change detected, propagating
    breth0: port 1(eth0) entering forwarding state
    ...

This bug was reported a while ago; could you please retest and find out whether the problem still exists?

Just re-tested this problem on 83-159. The guest did not hang this time, but we can still find the following error messages:

    kvm: exhaust allocatable IRQ sources!
    kvm: exhaust allocatable IRQ sources!

Backported the patch: "KVM: fix irq_source_id size verification".
You can pull the rpm's from here:
http://people.redhat.com/~ddutile/rhel5/bz520572/
Please install these rpm's and test, and let me know if it fixes the problem.

Additional note: the test kvm rpm's were built against the -191 kernel. If you can pull & test with -191, that'd be optimal. One of the later ones (past, say, -186) ought to do as well.
- Don

Just tested the patch. It doesn't work, and even worse, the guest hangs during boot:

    # rpm -qa |grep kvm
    kvm-qemu-img-83-161.el5bz520572v1
    etherboot-zroms-kvm-5.4.4-13.el5
    kvm-83-161.el5bz520572v1
    kvm-debuginfo-83-161.el5bz520572v1
    kvm-tools-83-161.el5bz520572v1
    kmod-kvm-83-161.el5bz520572v1
    [root@virtlab-66-84-58 ~]# uname -a
    Linux virtlab-66-84-58.englab.nay.redhat.com 2.6.18-191.el5 #1 SMP Mon Mar 1 15:59:02 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

Command:

    # qemu-kvm -drive file=/tmp/kvm_autotest_root/images/RHEL-Server-5.5-32.qcow2,if=ide,boot=on -m 512 -smp 1 -vnc :0 -pcidevice host=42:10.0 -pcidevice host=42:10.1 -pcidevice host=42:10.2 -pcidevice host=42:10.3 -pcidevice host=42:10.4 -pcidevice host=42:10.5 -pcidevice host=42:10.6 -pcidevice host=42:10.7

Please attach the following logs for both failing cases:
(a) /var/log/messages (dmesg) (on host)
(b) /var/log/libvirt/qemu/<guest-name>.log (on host)

If possible, run libvirt with debug on:

    export LIBVIRT_DEBUG=1
    export LIBVIRT_LOG_OUTPUT="1:file:<dir/filename>"

and post the libvirt output.

btw -- please try with virsh commands & xml for the guest, vs the qemu-kvm cmdline directly, so libvirt can capture the qemu log.

Hi Don,
Strangely, when I re-tested this problem with the patched RPMs, the guest worked fine and I could not find any extra error messages in dmesg except:

    BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

I also tested the 83-161 RPMs which are not patched, and found these error messages:

    kvm: exhaust allocatable IRQ sources!
    assigned_dev_enable_msix: assign irq: Bad address

Thus we can say the patch fixed the problem. I don't know what I did wrong last time...

Maybe you forgot to re-boot before re-testing? Installing the rpm won't affect the currently running kernel unless you rmmod the kvm modules, then modprobe the new ones back in. Anyhow, glad to see the second test effort showed positive results.
I'll dig into the kvm_destroy_phys_mem message to see if it truly indicates a possible bug or an unexpected code path for dev assignment. Will post a patch shortly, and ask if it should be a candidate for 5.5-z.
- Don

No, I had run `modprobe -r` on the old kvm modules and loaded the new kvm module, and checked that it was good before testing. Strange; I don't know what went wrong... Anyway, let's ignore the first test results. :)

While verifying the bug I hit a blocker: the host kernel panics when booted with "intel_iommu=on" on the kernel line. See:
https://bugzilla.redhat.com/show_bug.cgi?id=580425

Can reproduce the issue on kvm-83-164.el5:
1. The qemu monitor displays: "assigned_dev_enable_msix: assign irq: Bad address"
2. The dmesg of the guest contains: "kvm: exhaust allocatable IRQ sources!"

Re-tested on kvm-83-165.el5 and kvm-83-169.el5, kernel 2.6.18-194.el5: this issue does not occur.

Command line:

    /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=RHEL5.5-Server-64.qcow2,media=disk,if=ide,cache=off -net none -vnc :10 -monitor stdio -cpu qemu64,+sse2 -pcidevice host=03:10.0 -pcidevice host=03:10.1 -pcidevice host=03:10.2 -pcidevice host=03:10.3 -pcidevice host=03:10.4 -pcidevice host=03:10.5 -pcidevice host=03:10.6 -pcidevice host=03:10.7

Steps:
1. rmmod igb
   modprobe igb max_vfs=7
2. Bind 8 VFs to the pci-stub driver.
3. Boot a guest with the above command line.
4. Check the dmesg of the guest and the host, and also check whether there are error messages in the qemu monitor.
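The steps above do not spell out how the VFs were bound to pci-stub or how the repeated -pcidevice options were produced. The following is only a minimal sketch of one way to do it, assuming the usual sysfs new_id/unbind/bind files and the 03:10.0 to 03:10.7 addresses from the command line above. The helper names (pcidevice_args, bind_to_pci_stub) and the SYSFS_ROOT variable are hypothetical and were not part of the original test setup.

```shell
#!/bin/sh
# Sketch only. SYSFS_ROOT is a hypothetical knob so the helpers can be
# tried against a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print one "-pcidevice host=<bus:slot>.<fn>" option per function 0..count-1.
pcidevice_args() {
    bus_slot=$1   # e.g. 03:10
    count=$2      # e.g. 8
    fn=0
    args=""
    while [ "$fn" -lt "$count" ]; do
        args="$args -pcidevice host=$bus_slot.$fn"
        fn=$((fn + 1))
    done
    echo "${args# }"   # drop the single leading space
}

# Hand one PCI function (full address, e.g. 0000:03:10.0) to pci-stub.
bind_to_pci_stub() {
    bdf=$1
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"
    stub="$SYSFS_ROOT/bus/pci/drivers/pci-stub"
    vendor=$(cat "$dev/vendor")   # sysfs reports these as 0x....
    device=$(cat "$dev/device")
    # Register the ID with pci-stub; new_id takes hex without the 0x prefix.
    # Some kernels reject an ID that is already registered; that is harmless here.
    echo "${vendor#0x} ${device#0x}" > "$stub/new_id"
    # Detach from whatever driver currently owns the function, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind"
    fi
    echo "$bdf" > "$stub/bind"
}

# Possible usage (as root, after "modprobe igb max_vfs=7"):
#   for fn in 0 1 2 3 4 5 6 7; do bind_to_pci_stub "0000:03:10.$fn"; done
#   /usr/libexec/qemu-kvm ... $(pcidevice_args 03:10 8)
```

Reading the vendor and device IDs from sysfs avoids hard-coding an ID for the VF, which this report never states.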
Verified on kvm-83-206.el5:

    /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -m 4G -smp 2 -monitor stdio -drive file=/root/zhangjunyi/rhel5.6ide.raw,if=ide,boot=on,werror=stop,format=raw -net nic,vlan=0,macaddr=22:11:22:45:66:83,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup -uuid `uuidgen` -cpu qemu64,+sse2 -balloon none -boot c -vnc :10 -notify all -boot c -pcidevice host=03:10.0 -pcidevice host=03:10.1 -pcidevice host=03:10.2 -pcidevice host=03:10.3 -pcidevice host=03:10.4 -pcidevice host=03:10.5 -pcidevice host=03:10.6 -pcidevice host=03:10.7
    QEMU 0.9.1 monitor - type 'help' for more information
    (qemu) BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
    BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

The guest process works well, the host works well, and the VFs work well in the guest. However, lots of "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" messages are emitted in the qemu monitor. I have filed a bug for this:
Bug 645322 - Emit message "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" when boot guest with vf/pf attached.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0028.html