| Summary: | RFE: support <hostdev> <rom bar='on\|off'/> | | |
|---|---|---|---|
| Product: | [Community] Virtualization Tools | Reporter: | Stefan Assmann <sassmann> |
| Component: | virt-manager | Assignee: | Cole Robinson <crobinso> |
| Status: | CLOSED UPSTREAM | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | unspecified | CC: | agospoda, berrange, crobinso, dallan, ddutile, xen-maint |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-02-10 19:26:03 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
What versions of qemu-kvm & libvirt did you test with?

qemu-kvm-0.12.1.2-2.160.el6.x86_64
libvirt-0.8.7-18.el6.x86_64

After starting the guest and waiting a while I found this in syslog:

    kvm: 2550: cpu0 guest string pio down

Also the guest seems to be paused:

    virsh list --all
     Id Name                 State
    ----------------------------------
      1 rhel5-64-kvm         paused

I've manually upgraded the following packages to the 6.2 versions:

    glibc-2.12-1.47.el6.x86_64.rpm
    glibc-common-2.12-1.47.el6.x86_64.rpm
    glibc-devel-2.12-1.47.el6.x86_64.rpm
    glibc-headers-2.12-1.47.el6.x86_64.rpm
    libvirt-0.9.4-23.el6.x86_64.rpm
    libvirt-client-0.9.4-23.el6.x86_64.rpm
    libvirt-python-0.9.4-23.el6.x86_64.rpm
    netcf-libs-0.1.9-2.el6.x86_64.rpm
    qemu-img-0.12.1.2-2.209.el6.x86_64.rpm
    qemu-kvm-0.12.1.2-2.209.el6.x86_64.rpm
    sgabios-bin-0-0.3.20110621svn.el6.noarch.rpm
    spice-server-0.8.2-5.el6.x86_64.rpm

Still the same, guest does not start. Also when the guest switches from running to paused I sometimes get

    single mode not supported
    single mode not supported
    level sensitive irq not supported
    level sensitive irq not supported

in dmesg.

I logged into the test machine; 04:00.[0,1] are the 82580's, not the 82576; the latter are 02:00.[0,1].

    root.bos.redhat.com:~> lspci | grep Ethernet
    01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
    01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit Ethernet (rev 20)
    02:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    02:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    04:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
    04:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

So, if the description is correct, then 82576 do work, but 82580's do not. I don't have experience with 82580's, but the problem could be reset-related.
To check, run a loop test that unbinds the driver from the 82580, resets the device, and then re-binds the driver to it. Do these steps by echoing to the appropriate device-specific sysfs files under /sys/bus/pci/devices/<BDF>/[driver/unbind,reset]. If the device wedges after a number of these loops, then the device has a reset/re-config issue (a typical bug found while doing device assignment).

Created attachment 550878: pci-unbind-reset-bind.sh
Sorry Don, my mistake. The problem here is with the 82576 NIC, not the 82580.
I tried what you suggested and ran an unbind/reset/bind loop with 1000 iterations, and that worked flawlessly.
Attaching the script I used.
To avoid any further misunderstanding: the problem occurs with the device 0000:02:00.0.
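
The attached script is not reproduced in this report. As a rough illustration of the unbind/reset/bind loop described above, here is a minimal sketch, assuming a POSIX shell and root access. The function name `pci_unbind_reset_bind` and the `SYSFS_PCI` override are hypothetical and not taken from the attachment; the sysfs files it writes to (`drivers/<driver>/unbind`, `devices/<BDF>/reset`, `drivers/<driver>/bind`) are the standard kernel PCI interfaces.

```shell
#!/bin/sh
# Sketch of an unbind/reset/bind loop for a single PCI device.
# SYSFS_PCI can be overridden so the loop can be dry-run against a fake tree
# (an assumption added for illustration, not part of the attached script).
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

pci_unbind_reset_bind() {
    bdf="$1"          # full address, e.g. 0000:02:00.0
    iterations="$2"   # number of cycles, e.g. 1000
    dev="$SYSFS_PCI/devices/$bdf"

    # Remember which driver currently owns the device so it can be re-bound.
    if [ ! -e "$dev/driver" ]; then
        echo "no driver bound to $bdf" >&2
        return 1
    fi
    driver=$(basename "$(readlink -f "$dev/driver")")

    i=1
    while [ "$i" -le "$iterations" ]; do
        # Detach the driver, reset the function, then re-attach the driver.
        echo "$bdf" > "$SYSFS_PCI/drivers/$driver/unbind" || return 1
        echo 1 > "$dev/reset" || return 1
        echo "$bdf" > "$SYSFS_PCI/drivers/$driver/bind" || return 1
        i=$((i + 1))
    done
    echo "completed $iterations unbind/reset/bind cycles on $bdf"
}
```

Run as root, a call such as `pci_unbind_reset_bind 0000:02:00.0 1000` would correspond to the 1000-iteration test mentioned above. The `reset` file is only present when the kernel knows a reset method for the device, so a missing file is itself a useful hint.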
Very weird, after the whole unbind/reset/bind looping passthrough of both NICs seems to work. I tried several times now, including a reboot of the machine.

Don, should we close this and see if it happens again? I'd like to know why it didn't work before but I guess there's not much we can do now.

(In reply to comment #9)
> Very weird, after the whole unbind/reset/bind looping passthrough of both NICs
> seems to work. I tried several times now, including a reboot of the machine.
>
> Don, should we close this and see if it happens again? I'd like to know why it
> didn't work before but I guess there's not much we can do now.

What does your kernel boot cmdline look like? Please attach the full boot-up dmesg log.

Created attachment 551533: dmesg.txt
root.bos.redhat.com:~> cat /proc/cmdline
ro root=/dev/mapper/vg_dellpet41004-lv_root rd_LVM_LV=vg_dellpet41004/lv_root rd_LVM_LV=vg_dellpet41004/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=ttyS0,115200n81 ignore_loglevel no_console_suspend intel_iommu=on crashkernel=128M
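
The command line above includes `intel_iommu=on`, which Intel hosts need for KVM PCI device assignment. A quick way to confirm this on another machine could look like the sketch below; the helper name `has_intel_iommu` is made up for illustration, and it only checks that the option was requested at boot, not that the IOMMU actually initialized.

```shell
# Returns 0 if the given kernel command line contains intel_iommu=on.
has_intel_iommu() {
    # Pad with spaces so the option is matched as a whole word.
    case " $1 " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live host:
#   has_intel_iommu "$(cat /proc/cmdline)" && echo "intel_iommu=on is set"
```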
I just tried to pass the 82576 through to a guest; the guest first seemed stuck again but then started after 1-2 minutes... Don, I'll ping you so we can have a look at this together.

In /var/log/libvirt/libvirtd.log I found:

    06:27:04.486: 2044: error : daemonStreamEvent:208 : stream had I/O failure
    06:27:04.486: 2044: error : virFDStreamUpdateCallback:111 : internal error stream is not open
    06:27:05.437: 2044: error : qemuMonitorIO:583 : internal error End of file from monitor
    06:35:45.510: 2044: error : daemonStreamHandleAbort:590 : stream aborted at client request
    06:36:51.415: 2044: error : daemonStreamHandleAbort:590 : stream aborted at client request
    06:47:13.709: 2044: error : daemonStreamEvent:208 : stream had I/O failure
    06:47:13.710: 2044: error : virFDStreamUpdateCallback:111 : internal error stream is not open
    06:47:14.775: 2044: error : qemuMonitorIO:583 : internal error End of file from monitor

(In reply to comment #12)
> I just tried to passthrough the 82576 to a guest and the guest first seemed
> stuck again but then started after 1-2 minutes... Don, I'll ping you so we can
> have a look at this together.
>
> In /var/log/libvirt/libvirtd.log I found
> 06:27:04.486: 2044: error : daemonStreamEvent:208 : stream had I/O failure
> 06:27:04.486: 2044: error : virFDStreamUpdateCallback:111 : internal error
> stream is not open
> 06:27:05.437: 2044: error : qemuMonitorIO:583 : internal error End of file from
> monitor
> 06:35:45.510: 2044: error : daemonStreamHandleAbort:590 : stream aborted at
> client request
> 06:36:51.415: 2044: error : daemonStreamHandleAbort:590 : stream aborted at
> client request
> 06:47:13.709: 2044: error : daemonStreamEvent:208 : stream had I/O failure
> 06:47:13.710: 2044: error : virFDStreamUpdateCallback:111 : internal error
> stream is not open
> 06:47:14.775: 2044: error : qemuMonitorIO:583 : internal error End of file from
> monitor

With a device assigned to the guest, the guest tries to PXE boot (you can see this if you use virt-manager on the host); without the assigned device, there is no PXE boot. I don't understand why that happens. The 1-2 minute delay you are seeing is the PXE boot waiting for a selection and then timing out to do a local boot.

I think I know what's going on now. The 82576 NIC has an option ROM for PXE boot. When the device is passed to the guest, libvirt or kvm (not sure which) sees this option ROM and decides to boot from it! The idea is not so bad, but I don't see any option in virt-manager to disable this behaviour. However, I found the <rom bar='off'/> tag, which I added to the XML manually, and now the option ROM just gets ignored and everything works as expected.

I suggest adding an option in virt-manager to disable any PCI option ROM detected for PCI devices that are passed to the guest.

Created attachment 555077: virt-manager.jpg
This might be a good place to add the "disable PCI option ROM" option.
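
For reference, the manual workaround amounts to adding a `<rom bar='off'/>` child to the `<hostdev>` element. The sketch below prints such an element for a given PCI address; the function name `hostdev_xml_rom_off` is hypothetical, and the address values in the usage note are those of the 82576 at 0000:02:00.0 mentioned earlier in this report.

```shell
# Prints a <hostdev> element whose option ROM BAR is hidden from the guest.
# Arguments: PCI domain, bus, slot, function (hex, as libvirt expects).
hostdev_xml_rom_off() {
    domain="$1"
    bus="$2"
    slot="$3"
    func="$4"
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='$domain' bus='$bus' slot='$slot' function='$func'/>
  </source>
  <rom bar='off'/>
</hostdev>
EOF
}
```

Calling `hostdev_xml_rom_off 0x0000 0x02 0x00 0x0` prints the fragment for the 82576; it can then be pasted into the guest definition with `virsh edit <guest>` in place of the existing `<hostdev>` element.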
Sounds like we should change the component to virt-manager.

Upstream now:
commit 82754ddc84041ece7a8462e1b14860eda4ea022b
Author: Cole Robinson <crobinso>
Date: Mon Feb 10 14:24:22 2014 -0500
Expose hostdev rombar in UI and cli (bz 768857)
Description of problem:
dell-pet410-04.lab.bos.redhat.com has 2 Intel NICs (1x 82576, 1x 82580). When I assign the 82576 NIC to a KVM guest, about 80% of the time the guest does not boot at all. Nothing is observed on the guest's serial console. However, assigning the 82580 NIC works all the time.

Version-Release number of selected component (if applicable):
kernel-2.6.32-131.0.15.el6.x86_64
libvirt-0.8.7-18.el6.x86_64

How reproducible:
often

Steps to Reproduce:
dell-pet410-04.lab.bos.redhat.com
1. Assign the NIC to the KVM guest via XML:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
    </hostdev>

2. virsh start guest

Actual results:
guest often does not boot

Expected results:
guest boots all the time

Additional info: