Bug 925170 - MSI routing for 1553 card to guest stops working
Summary: MSI routing for 1553 card to guest stops working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Alex Williamson
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-23 00:13 UTC by wrob0123
Modified: 2018-12-01 16:04 UTC
CC: 14 users

Fixed In Version: qemu-kvm-0.12.1.2-2.367.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-21 06:46:58 UTC
Target Upstream Version:
Embargoed:


Attachments
lspci -vv output (53.42 KB, text/plain)
2013-03-23 00:13 UTC, wrob0123
lspci tree information (4.23 KB, text/plain)
2013-03-30 02:28 UTC, wrob0123
qemu log entry from /var/log/libvirt/qemu (1.81 KB, text/plain)
2013-04-01 17:00 UTC, wrob0123
dmesg including IRQ 16 burp (53.88 KB, text/plain)
2013-04-17 23:34 UTC, wrob0123
/proc/interrupts while guest is running the app (5.28 KB, text/plain)
2013-04-17 23:36 UTC, wrob0123
lspci -vvv output for new system (61.95 KB, text/plain)
2013-04-17 23:37 UTC, wrob0123


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:1553 0 normal SHIPPED_LIVE Important: qemu-kvm security, bug fix, and enhancement update 2013-11-20 21:40:29 UTC

Description wrob0123 2013-03-23 00:13:34 UTC
Created attachment 714855 [details]
lspci -vv output

Description of problem:
We run a Windows 7 (32-bit) guest VM with a 1553 interface (PCIe) card assigned; certain applications rely on interrupt-driven interaction with the 1553 card. An older version of the card uses INTx and works fine, but with the newer card, which uses MSI interrupts, the application only gets a few updates of interface data before the interrupts stop occurring. I suspect there is a problem with kvm and/or qemu dealing with a high rate of MSI interrupts, and would like some suggestions on how to troubleshoot and narrow down the problem.

Version-Release number of selected component (if applicable):
Linux bubba 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
libvirt-0.9.10-21.el6.x86_64

How reproducible:
Sometimes get just a few interrupts, sometimes get thousands of interrupts before stalling out, but it never runs more than a few seconds.

Steps to Reproduce:
1. install win7 guest (with libvirt/VMM) and assign the 1553 card
2. start 1553 application on guest which needs interrupts
3. wait a second for the application to stop getting updates
4. check /proc/interrupts to verify that interrupt counter stopped
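
Step 4 can be scripted: take two snapshots of /proc/interrupts a second apart and compare the count for the card's IRQ. A minimal sketch (IRQ 54 and the snapshot workflow are illustrative; adjust for your host):

```shell
#!/bin/sh
# Report whether the interrupt count for a given IRQ changed between two
# snapshots of /proc/interrupts. IRQ 54 is just the value from this report.

count_for_irq() {            # $1 = IRQ number, $2 = snapshot file
  awk -v irq="$1:" '$1 == irq { print $2 }' "$2"
}

check_stalled() {            # $1 = IRQ number, $2 = before, $3 = after
  before_count=$(count_for_irq "$1" "$2")
  after_count=$(count_for_irq "$1" "$3")
  if [ "$before_count" = "$after_count" ]; then
    echo stalled
  else
    echo running
  fi
}

# Typical use on the host:
#   cp /proc/interrupts before; sleep 1; cp /proc/interrupts after
#   check_stalled 54 before after
```

Note: only the CPU0 column is compared, which is the column that stops updating here.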
  
Actual results:
The counter in the guest's application and the /proc/interrupts counts for CPU0 stop updating

Expected results:
It should work like the same setup on a physical machine, where the application continues to get updates from the 1553 interface. With an older version of the 1553 card, which uses INTx instead of MSI, everything works well on the VM guest.

Additional info:
see lspci -vv output in attachment

Comment 1 wrob0123 2013-03-23 01:29:54 UTC
Tried the kernel boot option pci=nomsi with no luck - unable to start guest.
Got error in qemu log when trying to start the guest with (libvirt) VMM:

qemu-kvm: -device pci-assign,host=24:00.0,id=hostdev0,configfd=25,bus=pci.0,addr=0x7: Device 'pci-assign' could not be initialized: Failed to assign irq: Invalid argument
Failed to assign irq for "hostdev0": Invalid argument
Perhaps you are assigning a device that shares an IRQ with another device?
qemu-kvm: -device pci-assign,host=24:00.0,id=hostdev0,configfd=25,bus=pci.0,addr=0x7: Device 'pci-assign' could not be initialized

The card is using IRQ 54 - not shared by another device

Comment 2 wrob0123 2013-03-23 03:07:19 UTC
Tried the kernel boot option intremap=off and it does not make a difference

Something that bothers me in the dmesg output - why does irq 80 get assigned multiple times to the card by the pci-stub driver?

This is showing up in pairs:
pci-stub 0000:24:00.0: irq 80 for MSI/MSI-X
pci-stub 0000:24:00.0: irq 80 for MSI/MSI-X

I thought it would be something like irq 80 and irq 81 for separate vectors

Comment 3 wrob0123 2013-03-23 04:11:59 UTC
Tried disabling irqbalance - no effect

Tried kernel boot option noapic - no effect

Should I try kernel option pci=noacpi also?  Or acpi=off ?

Comment 4 wrob0123 2013-03-30 02:28:18 UTC
Created attachment 718210 [details]
lspci tree information

hand generated lspci tree may be helpful

Comment 5 juzhang 2013-04-01 09:03:54 UTC
(In reply to comment #0)
> Created attachment 714855 [details]
> lspci -vv output
> 
> Description of problem:
> Running windows 7 (32bit) guest VM with 1553 interface (PCIe) card assigned,
> certain applications rely on interrupt driven interaction with the 1553
> card. An older version of the card uses INTx and works fine, but with the
> newer card which uses MSI interrupts, the application only gets a few
> updates of interface data before the interrupts stop occurring. I suspect
> that there is a problem with kvm and/or qemu dealing with a high rate of MSI
> interrupts, and would like some suggestions about how to troubleshoot to
> narrow down the problem.
> 
> Version-Release number of selected component (if applicable):
> Linux bubba 2.6.32-279.el6.x86_64 #1 SMP Fri Jun 22 12:19:21 UTC 2012 x86_64
> qemu-kvm-0.12.1.2-2.295.el6.x86_64
> libvirt-0.9.10-21.el6.x86_64
> 
> How reproducible:
> Sometimes get just a few interrupts, sometimes get thousands of interrupts
> before stalling out, but it never runs more than a few seconds.
> 
> Steps to Reproduce:
> 1. install win7 guest (with libvirt/VMM) and assign the 1553 card
Would you please tell me how to assign the 1553 card to guest? 
> 2. start 1553 application on guest which needs interrupts
> 3. wait a second for the application to stop getting updates
> 4. check /proc/interrupts to verify that interrupt counter stopped
>   
> Actual results:
> The counter in the guest's application and the /proc/interrupts counts for
> CPU0 stop updating
> 
> Expected results:
> It should work like the same setup on a physical machine where the
> application continues to get updates from the 1553 interface. With an older
> version of the 1553 card which uses INTx instead of MSI, everything works
> good on the VM guest.
> 
> Additional info:
> see lspci -vv output in attachment

Comment 6 wrob0123 2013-04-01 16:59:23 UTC
> > 1. install win7 guest (with libvirt/VMM) and assign the 1553 card
> Would you please tell me how to assign the 1553 card to guest?

Using Virtual Machine Manager gui while the guest is not running, go to VM's "Details" view, click "Add Hardware" then "PCI Host Device" and then "24:00:0"

I will attach log file entry from /var/log/libvirt/qemu

Comment 7 wrob0123 2013-04-01 17:00:30 UTC
Created attachment 730343 [details]
qemu log entry from /var/log/libvirt/qemu

Comment 10 Alex Williamson 2013-04-09 19:55:28 UTC
(In reply to comment #2)
> Tried the kernel boot option intremap=off and it does not make a difference
> 
> Something that bothers me in the dmesg output - why does irq 80 get assigned
> multiple times to the card by the pci-stub driver?
> 
> This is showing up in pairs:
> pci-stub 0000:24:00.0: irq 80 for MSI/MSI-X
> pci-stub 0000:24:00.0: irq 80 for MSI/MSI-X
> 
> I thought there it would be like irq 80 and irq 81 for separate vectors

The device only supports a single MSI vector; this happens when MSI is enabled, disabled, then re-enabled.  There's just no printout in the kernel when it is disabled.

Comment 12 Alex Williamson 2013-04-09 20:03:52 UTC
Please test with the latest release (6.4) and look at /proc/interrupts on the host to determine if the device is continuing to send interrupts.  Does restarting the guest VM (ie. reboot the guest within the same VM session) allow the card to work again briefly?  MSI is used by some network cards and is known to work at high interrupt rates there.

Comment 13 Alex Williamson 2013-04-09 20:10:13 UTC
It also occurs to me that 32bit Windows may not actually enable MSI support and KVM device assignment is using a hybrid mode where INTx is emulated to the guest using MSI on the physical device.  Please check the resource tab for the device in device manager and check whether the guest sees the device using MSI or INTx.  You may have better results with 64bit Windows.

Comment 14 wrob0123 2013-04-10 02:27:38 UTC
(In reply to comment #12)
> Please test with the latest release (6.4) and look at /proc/interrupts on
> the host to determine if the device is continuing to send interrupts.  Does
> restarting the guest VM (ie. reboot the guest within the same VM session)
> allow the card to work again briefly?  MSI is used by some network cards and
> is known to work at high interrupt rates there.

Will not have a 6.4 host available until next week. With current 6.3 host, restarting win7 guest (within same VM session, or new session) allows the card to work again briefly - so does disabling and reenabling device driver in guest.

Where MSI is successful for some network cards, is there a host driver (other than pci-stub) involved with setting up the pci config?

Comment 15 wrob0123 2013-04-10 02:43:12 UTC
(In reply to comment #13)
> It also occurs to me that 32bit Windows may not actually enable MSI support
> and KVM device assignment is using a hybrid mode where INTx is emulated to
> the guest using MSI on the physical device.  Please check the resource tab
> for the device in device manager and check whether the guest sees the device
> using MSI or INTx.  You may have better results with 64bit Windows.

To the 32bit win7 guest, the device looks like it is using INTx interrupt line 10 or IRQ 0x0A. Perhaps this has to do with the way libvirt created the guest, and I do not understand how a 64bit windows machine would fare any better. How can I tell kvm and/or qemu to not use MSI interrupts for this device?

Comment 16 Alex Williamson 2013-04-10 20:02:56 UTC
(In reply to comment #14)
> Will not have a 6.4 host available until next week. With current 6.3 host,
> restarting win7 guest (within same VM session, or new session) allows the
> card to work again briefly - so does disabling and reenabling device driver
> in guest.

So possibly the device and driver get out of sync about whether an interrupt is pending.

> Where MSI is successful for some network cards, is there a host driver
> (other than pci-stub) involved with setting up the pci config?

No

(In reply to comment #15)
> To the 32bit win7 guest, the device looks like it is using INTx interrupt
> line 10 or IRQ 0x0A. Perhaps this has to do with the way libvirt created the
> guest, and I do not understand how a 64bit windows machine would fare any
> better. How can I tell kvm and/or qemu to not use MSI interrupts for this
> device?

There's currently no way to manipulate the interrupt in RHEL qemu-kvm.  The difference with 64bit windows is that it will use MSI when available which would avoid a hybrid mode where the device uses MSI, but qemu-kvm emulates INTx to the guest.  MSI in both host and guest is preferred due to the overhead of INTx.

Comment 18 wrob0123 2013-04-12 05:49:02 UTC
(In reply to comment #16)
> > restarting win7 guest (within same VM session, or new session) allows the
> > card to work again briefly - so does disabling/reenabling guest driver
> So possibly the device and driver get out of sync when interrupt is pending

Stole a better machine today and set it up with RHEL 6.4 on host, and 64bit win7 on guest with 8GB RAM, 4 CPUs and the 1553 card assigned. Got more interrupts to the guest driver before it stalled out, but basically looks like the same problem. Sounds like we agree that interrupts are getting lost. I would love to hear about how to troubleshoot kvm/qemu/libvirt to find out where they are getting lost. Thanks for any help (or advice) you are willing to provide. From what I can tell, it sounds like I will have to either install a development version of kernel (with VFIO) and the recommended corresponding versions of qemu and libvirt, or just wait for RHEL 7.

> > To the 32bit win7 guest, the device looks like it is using INTx interrupt
> There's currently no way to manipulate the interrupt in RHEL qemu-kvm

On the 64bit win7 guest, device is using INT 5 and MSI is disabled

> ... 64bit windows ... will use MSI when available 

Is this a problem with the driver provided by DDC, or is qemu-kvm not presenting the resource to the guest as an MSI-capable device because of the way I used Virtual Machine Manager (and libvirt) to create the machine?

> ... MSI in both host and guest is preferred

OK - but I need a workaround until something is fixed
If I can figure out where the interrupts are getting lost ...
Are you saying that the hybrid mode is broken?

Other notes of mods to allow pci passthrough:
1. In /etc/grub.conf, kernel line, added “intel_iommu=on” 
2. Put this line into /etc/modprobe.d/kvm.conf :
   options kvm allow_unsafe_assigned_interrupts=1
3. Edited /etc/libvirt/qemu.conf to change a couple of lines:
   relaxed_acs_check = 1
   clear_emulator_capabilities = 0

Possibly one of the mods above is bogus?

Here are some interesting messages from dmesg:
kvm: 3309: cpu0 unhandled rdmsr: 0x2d
kvm: 3309: cpu0 unhandled rdmsr: 0x35
kvm: 3309: cpu0 unhandled rdmsr: 0x3a
kvm: 3309: cpu0 unhandled rdmsr: 0xce
kvm: 3309: cpu0 unhandled rdmsr: 0xee
kvm: 3309: cpu0 unhandled rdmsr: 0x199
kvm: 3309: cpu0 unhandled rdmsr: 0x19c
kvm: 3309: cpu0 unhandled rdmsr: 0x1a1
kvm: 3309: cpu0 unhandled rdmsr: 0x1a2
kvm: 3309: cpu0 unhandled rdmsr: 0x1ac
__ratelimit: 43 callbacks suppressed
kvm: 3309: cpu0 unhandled rdmsr: 0x2d
kvm: 3309: cpu0 unhandled rdmsr: 0x35
kvm: 3309: cpu0 unhandled rdmsr: 0x3a
kvm: 3309: cpu0 unhandled rdmsr: 0xce
kvm: 3309: cpu0 unhandled rdmsr: 0xee
kvm: 3309: cpu0 unhandled rdmsr: 0x199
kvm: 3309: cpu0 unhandled rdmsr: 0x19c
kvm: 3309: cpu0 unhandled rdmsr: 0x1a1
kvm: 3309: cpu0 unhandled rdmsr: 0x1a2
kvm: 3309: cpu0 unhandled rdmsr: 0x1ac

Comment 19 wrob0123 2013-04-12 06:47:39 UTC
Tried taking out the kvm option allow_unsafe_assigned_interrupts=1
dmesg showed this error:
kvm_iommu_map_guest: No interrupt remapping support ...

Should I try the kernel option nointremap ?

Comment 22 Alex Williamson 2013-04-16 20:29:59 UTC
(In reply to comment #18)
> (In reply to comment #16)
> > > restarting win7 guest (within same VM session, or new session) allows the
> > > card to work again briefly - so does disabling/reenabling guest driver
> > So possibly the device and driver get out of sync when interrupt is pending
> 
> Stole a better machine today and set it up with RHEL 6.4 on host, and 64bit
> win7 on guest with 8GB RAM, 4 CPUs and the 1553 card assigned. Got more
> interrupts to the guest driver before it stalled out, but basically looks
> like the same problem. Sounds like we agree that interrupts are getting
> lost. I would love to hear about how to troubleshoot kvm/qemu/libvirt to
> find out where they are getting lost. Thanks for any help (or advice) you
> are willing to provide. From what I can tell, it sounds like I will have to
> either install a development version of kernel (with VFIO) and the
> recommended corresponding versions of qemu and libvirt, or just wait for
> RHEL 7.

Look in /proc/interrupts on the host, find the kvm entry and check whether it's reported as MSI or APIC and note whether the interrupt count continues to increase after the device stops working in the guest.
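
To automate that check, a small sketch (it assumes the kvm_assigned_*_device entry naming seen elsewhere in this bug; pass /proc/interrupts, or a saved copy, as the argument):

```shell
#!/bin/sh
# Classify the assigned device's host-side interrupt as MSI or INTx by the
# chip name on the kvm_assigned_* line of a /proc/interrupts-format file.
irq_mode() {                 # $1 = /proc/interrupts (or a saved snapshot)
  awk '/kvm_assigned/ {
         if ($0 ~ /MSI/) print "MSI"; else print "INTx"
       }' "$1"
}

# On the host:  irq_mode /proc/interrupts
```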

> > > To the 32bit win7 guest, the device looks like it is using INTx interrupt
> > There's currently no way to manipulate the interrupt in RHEL qemu-kvm
> 
> On the 64bit win7 guest, device is using INT 5 and MSI is disabled

That's unfortunate...

> > ... 64bit windows ... will use MSI when available 
> 
> Is this a problem with the driver provided by DDC, or is qemu-kvm not
> presenting the resource to the guest as an MSI capable device based on the
> way the I used Virtual Machine Manager (and libvirt) to create the machine?

I suspect it's a problem with the DDC driver.  There's nothing you can change in virt-manager or libvirt to prevent MSI being exposed to the guest.  Most drivers will try to take advantage of MSI when available.  Not using it could just mean the driver hasn't been updated for MSI, or it could mean MSI support on the card is broken.

> > ... MSI in both host and guest is preferred
> 
> OK - but I need a workaround until something is fixed
> If I can figure out where the interrupts are getting lost ...
> Are you saying that the hybrid mode is broken?

It may be incompatible with this device or it may be that the device is broken and advertises MSI support, but it doesn't work.

> Other notes of mods to allow pci passthrough:
> 1. In /etc/grub.conf, kernel line, added “intel_iommu=on” 

Yes, required.

> 2. Put this line into /etc/modprobe.d/kvm.conf :
>    options kvm allow_unsafe_assigned_interrupts=1

This is only necessary if device assignment doesn't work without it, which means the host chipset doesn't support interrupt remapping and you opt-in to unsafe interrupts.

> 3. Edited /etc/libvirt/qemu.conf to change a couple of lines:
>    relaxed_acs_check = 1

Only necessary if libvirt won't otherwise allow you to assign the device.  This also means that the data path from the device to the IOMMU doesn't guarantee accesses won't be re-routed (u-turned) before reaching the IOMMU for translation.  This is generally not recommended and could cause problems.

>    clear_emulator_capabilities = 0

This should not be required.

> Possibly one of the mods above are bogus?
>
> Here are some interesting messages from dmesg:
> kvm: 3309: cpu0 unhandled rdmsr: 0x2d
> kvm: 3309: cpu0 unhandled rdmsr: 0x35
> kvm: 3309: cpu0 unhandled rdmsr: 0x3a
> kvm: 3309: cpu0 unhandled rdmsr: 0xce
> kvm: 3309: cpu0 unhandled rdmsr: 0xee
> kvm: 3309: cpu0 unhandled rdmsr: 0x199
> kvm: 3309: cpu0 unhandled rdmsr: 0x19c
> kvm: 3309: cpu0 unhandled rdmsr: 0x1a1
> kvm: 3309: cpu0 unhandled rdmsr: 0x1a2
> kvm: 3309: cpu0 unhandled rdmsr: 0x1ac
> __ratelimit: 43 callbacks suppressed
> kvm: 3309: cpu0 unhandled rdmsr: 0x2d
> kvm: 3309: cpu0 unhandled rdmsr: 0x35
> kvm: 3309: cpu0 unhandled rdmsr: 0x3a
> kvm: 3309: cpu0 unhandled rdmsr: 0xce
> kvm: 3309: cpu0 unhandled rdmsr: 0xee
> kvm: 3309: cpu0 unhandled rdmsr: 0x199
> kvm: 3309: cpu0 unhandled rdmsr: 0x19c
> kvm: 3309: cpu0 unhandled rdmsr: 0x1a1
> kvm: 3309: cpu0 unhandled rdmsr: 0x1a2
> kvm: 3309: cpu0 unhandled rdmsr: 0x1ac

These are likely probes for CPU features that aren't supported; note that the libvirt log also contains:

CPU feature pdcm not found
CPU feature smx not found
CPU feature dtes64 not found

(In reply to comment #19)
> Tried taking out the kvm option allow_unsafe_assigned_interrupts=1
> dmesg showed this error:
> kvm_iommu_map_guest: No interrupt remapping support ...
> 
> Should I try the kernel option nointremap ?

No, nointremap won't help because the hardware does not support VT-d2, which added interrupt remapping support.  This means you need to keep allow_unsafe_assigned_interrupts=1 and opt-in to the security implications to make use of device assignment on this host.

Comment 24 Alex Williamson 2013-04-17 04:31:34 UTC
Please test the qemu-kvm packages found here:

http://people.redhat.com/alwillia/bz925170/

This will not change the default behavior, but does offer a prefer_msi=on|off option which can be used to disable the hybrid msi-host/intx-guest mode.  To use this on the qemu command line, simply add "prefer_msi=off" to the list of options for the -device pci-assign entry.  To use via libvirt, follow the instructions found here:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/virt.html

Follow the "virtio network device packet transmission algorithms" section, replacing the wrapper in step 1 with this:

$ cat > /usr/libexec/qemu-kvm.prefer_intx << EOF
#!/bin/sh
exec /usr/libexec/qemu-kvm `echo "$@" | sed 's|configfd=|prefer_msi=off,configfd=|g'`
EOF

Use this new file name for steps 2, 3 and 7.  Steps 3-6 are necessary when selinux is in enforcing mode.  After starting the guest, you should be able to see with 'ps aux | grep qemu' that the prefer_msi=off option is part of the pci-assign device options and your card should use INTx both on host and guest.

You will also need to make sure the device has access to an exclusive interrupt, which may require unbinding devices sharing the interrupt from host drivers.  This can be done by finding the conflicting drivers in /sys/bus/pci/drivers.  In the appropriate directory you can "echo 0000:xx:yy.z > unbind" (where xx:yy.z is the PCI bus:slot.func for the device).  If available, you can also unload modules using "modprobe -r" to remove conflicting drivers altogether.
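
As a sketch of the unbind step (the device address and driver name below are hypothetical, not from this bug; the function only prints the command so it can be reviewed before being run as root):

```shell
#!/bin/sh
# Build the sysfs unbind command for a device sharing the IRQ.
# Example address/driver are hypothetical; substitute your own.
unbind_cmd() {               # $1 = PCI address (dddd:bb:ss.f), $2 = driver
  printf 'echo %s > /sys/bus/pci/drivers/%s/unbind\n' "$1" "$2"
}

# unbind_cmd 0000:00:1a.0 uhci_hcd
# then run the printed command as root once you've confirmed the target
```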

Does this allow the card to work?

Comment 25 wrob0123 2013-04-17 04:52:48 UTC
(In reply to comment #22)

> Look in /proc/interrupts on the host, find the kvm entry and check whether
> it's reported as MSI or APIC and note whether the interrupt count continues
> to increase after the device stops working in the guest.

PCI-MSI-edge  kvm_assigned_msi_device
The interrupt count stops at the same time the guest app stops getting updates driven by the interrupts in the DDC driver. However, I can start a polled mode application and read data from the card.

> > > ... 64bit windows ... will use MSI when available 
> > 
> > Is this a problem with the driver provided by DDC, or is qemu-kvm not
> > presenting the resource to the guest as an MSI capable device based on the
> > way the I used Virtual Machine Manager (and libvirt) to create the machine?
> 
> I suspect it's a problem with the DDC driver.  There's nothing you can
> change in virt-manager or libvirt to prevent MSI being exposed to the guest.
> Most drivers will try to take advantage of MSI when available.  Not using it
> could just mean the driver hasn't been updated for MSI or it could me MSI
> support on the card is broken.

On a physical windows machine, the driver does not seem to use MSI either.

> > > ... MSI in both host and guest is preferred
> > 
> > OK - but I need a workaround until something is fixed
> > If I can figure out where the interrupts are getting lost ...
> > Are you saying that the hybrid mode is broken?
> 
> It may be incompatible with this device or it may be that the device is
> broken and advertises MSI support, but it doesn't work.

If the hybrid mode is not compatible, or the card does not implement MSI support correctly, would that mean interrupts are getting lost in kvm?
 
Another note: This new 64bit win7 guest has more cores and memory allocated than the previous 32bit win7 guests, and it seems to run longer before the lost interrupt problem manifests. Is that a clue on where the interrupt is getting lost, or do you think it just runs better since it is 64bit?

> > 3. Edited /etc/libvirt/qemu.conf to change a couple of lines:
> >    relaxed_acs_check = 1
> >    clear_emulator_capabilities = 0
> 
> This should not be required.

I can run the guest without these mods to /etc/libvirt/qemu.conf
(probably vestiges of 6.2 workarounds on older hardware)
but still need allow_unsafe_assigned_interrupts=1

> > Should I try the kernel option nointremap ?
> 
> No, nointremap won't help because the hardware does not support VT-d2, which
> added interrupt remapping support. 

Correct, the system does not support VT-d2. Given that I am stuck with this hardware, what else can I try as a workaround? Are there any troubleshooting or debugging steps to help narrow down where the interrupt is getting lost?

DDC is reluctant to change their driver or hardware design, and assumes the interrupts are getting lost in qemu-kvm. I have opened a RH support case, and have provided more details about their card in that forum. I believe MSI support in the Linux kernel requires compliance to PCI spec 2.3, but it may be that DDC designed their MSI support to the initial version in PCI spec 2.2.

Would RHEL 7 help my situation, or should I try development versions of kernel, qemu-kvm, vfio, libvirt, etc?

Comment 26 wrob0123 2013-04-17 05:16:43 UTC
(In reply to comment #24)
> Please test the qemu-kvm packages found here:

Yes, I have read about the new prefer_msi option to pci-assign, and that is why I have been asking to try RHEL 7. Are these qemu packages compatible with my current kernel?  (2.6.32-358.el6.x86_64)

Comment 27 wrob0123 2013-04-17 06:40:34 UTC
(In reply to comment #26)
> (In reply to comment #24)
> > Please test the qemu-kvm packages found here:
> 
> Yes, I have read about the new prefer_msi option to pci-assign, and that is
> why I have been asking to try RHEL 7. Are these qemu packages compatible
> with my current kernel?  (2.6.32-358.el6.x86_64)

It appears these qemu packages work with my kernel, and do allow me to configure the host side interrupt type to IO-APIC-fasteoi  kvm_assigned_intx_device

The guest has been running the interrupt-driven application for 20 minutes now, so it appears that this workaround is a success.

Based on your warning about getting an exclusive interrupt, I assume these qemu packages do not support the share_intx option. I did get the following console message when I started the guest app:

Message from syslogd@fusion at Apr 17 01:11:15 ...
 kernel:Disabling IRQ #16

The DDC card is getting assigned IRQ 54 by pci-stub, but I have seen this type of issue before and worked around by disabling a USB port. Do I need to do this, or did the kernel automatically take care of that problem?

Comment 28 Alex Williamson 2013-04-17 15:06:17 UTC
(In reply to comment #27)
> (In reply to comment #26)
> > (In reply to comment #24)
> > > Please test the qemu-kvm packages found here:
> > 
> > Yes, I have read about the new prefer_msi option to pci-assign, and that is
> > why I have been asking to try RHEL 7.

RHEL7 is in planning and development at this point; it's not yet being shared externally.  To get a preview of what RHEL7 will contain, using the latest Fedora is the best option.  VFIO does not use this hybrid model, and pci-assign has also moved away from it as the default.

> > Are these qemu packages compatible
> > with my current kernel?  (2.6.32-358.el6.x86_64)
> 
> It appears these qemu packages work with my kernel, and do allow me to
> configure the host side interrupt type to IO-APIC-fasteoi 
> kvm_assigned_intx_device
> 
> The guest has been running the interrupt driven application on the guest now
> for 20 minutes, so it appears that this workaround is a success. 

Great

> Based on your warning about getting an exclusive interrupt, I assume these
> qemu packages do not support the share_intx option.

Correct, share_intx is a much more intrusive change and would require both qemu and kernel updates.  I can't really recommend adding those kinds of changes to RHEL6 for such an obscure device.

> I did get the following
> console message when I started the guest app:
> 
> Message from syslogd@fusion at Apr 17 01:11:15 ...
>  kernel:Disabling IRQ #16
> 
> The DDC card is getting assigned IRQ 54 by pci-stub, but I have seen this
> type of issue before and worked around by disabling a USB port. Do I need to
> do this, or did the kernel automatically take care of that problem?

The message above means that the kernel has disabled the interrupt due to excessive unhandled interrupts.  Can you provide full dmesg for this system, including getting one of these spurious interrupt events, as well as /proc/interrupts on the host while the guest is running and lspci -vvv for this new system?

Comment 29 wrob0123 2013-04-17 23:32:11 UTC
(In reply to comment #28)

> > The guest has been running the interrupt driven application on the guest now
> > for 20 minutes, so it appears that this workaround is a success. 
> Great
Testing all night went well - thank you very much!

> > I did get the following
> > console message when I started the guest app:
> > 
> > Message from syslogd@fusion at Apr 17 01:11:15 ...
> >  kernel:Disabling IRQ #16
> > 
> > The DDC card is getting assigned IRQ 54 by pci-stub, but I have seen this
> > type of issue before and worked around by disabling a USB port. Do I need to
> > do this, or did the kernel automatically take care of that problem?
> 
> The message above means that the kernel has disabled the interrupt due to
> excessive unhandled interrupts.  Can you provide full dmesg for this system,
> including getting one of these spurious interrupt events, as well as
> /proc/interrupts on the host while the guest is running and lspci -vvv for
> this new system?

See 3 new attachments: dmesg3, proc_interrupts, and lspci_vvv

> > Are these qemu packages compatible with my current kernel?
> > (2.6.32-358.el6.x86_64)

Will they work on older or newer kernels?
Some of our older 6.3 systems are using 2.6.32-279.el6.x86_64

Here are the steps I will use to disable the USB device in the future:
$ echo $"DISABLE usb device to avoid the IRQ 16 problems"
$ echo $"0000:00:1a.0" > /sys/bus/pci/devices/0000:00:1a.0/driver/unbind
$ echo $"8086 3a37" > /sys/bus/pci/drivers/pci-stub/new_id
$ ls /sys/bus/pci/drivers/pci-stub

Comment 30 wrob0123 2013-04-17 23:34:47 UTC
Created attachment 737042 [details]
dmesg including IRQ 16 burp

Comment 31 wrob0123 2013-04-17 23:36:38 UTC
Created attachment 737043 [details]
/proc/interrupts while guest is running the app

Comment 32 wrob0123 2013-04-17 23:37:31 UTC
Created attachment 737044 [details]
lspci -vvv output for new system

Comment 33 Alex Williamson 2013-04-18 20:06:43 UTC
(In reply to comment #29)
> (In reply to comment #28)
> > > Are these qemu packages compatible with my current kernel?
> > > (2.6.32-358.el6.x86_64)
> 
> Will they work on older or newer kernels?
> Some of our older 6.3 systems are using 2.6.32-279.el6.x86_64

They're based on the RHEL6.4 qemu-kvm package, so I can't guarantee they'll work on 6.3.  Ideally they should, but we test integrated versions, not mix-and-match combinations.
 
> Here are the steps I will use to disable the USB device in the future:
> $ echo $"DISABLE usb device to avoid the IRQ 16 problems"
> $ echo $"0000:00:1a.0" > /sys/bus/pci/devices/0000:00:1a.0/driver/unbind
> $ echo $"8086 3a37" > /sys/bus/pci/drivers/pci-stub/new_id
> $ ls /sys/bus/pci/drivers/pci-stub

If this does avoid the problem, I'm suspicious that it's only because, without this USB controller, there are no devices connected to IRQ 16, so it will be masked.  The USB device may not even be the one generating the spurious interrupts.

The following devices all seem capable of generating interrupts on IRQ16:

pci 0000:00:03.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:07.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1b.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:1c.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:20:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:20:03.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:20:05.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:20:07.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16

All of these except 1a.0 (UHCI USB) and 1b.0 (HD Audio) are Root Ports.  The audio device is configured for MSI, so it should not be signaling via INTx.  The USB driver seems not to think its device is signaling the interrupt or it would have responded to it.

Root ports are generally bound to the pcieport driver which will enable MSI, however only 1c.0 is bound to that driver.  Is this intentional?  It's possible that since the device being assigned does not have a reset mechanism that we're doing a secondary bus reset on the root port (20:07.0) and that reset is generating spurious interrupts.  Managing the root port with the pcieport driver may avoid this.
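A minimal sketch of how the root-port binding could be inspected from sysfs (the address 0000:20:07.0 is taken from this system; the `driver_of` helper and the `SYSFS` override are illustrative, not part of any standard tool):

```shell
#!/bin/sh
# Print the driver currently bound to a PCI device, or "none".
# SYSFS defaults to /sys; it is overridable only for illustration/testing.
driver_of() {
    link="${SYSFS:-/sys}/bus/pci/devices/$1/driver"
    if [ -e "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

driver_of 0000:20:07.0
# If this prints "none", binding pcieport by hand (as root) could be tried:
#   echo 0000:20:07.0 > /sys/bus/pci/drivers/pcieport/bind
```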

Comment 34 wrob0123 2013-04-19 00:46:53 UTC
(In reply to comment #33) 
> Root ports are generally bound to the pcieport driver which will enable MSI,
> however only 1c.0 is bound to that driver.  Is this intentional?  It's
> possible that since the device being assigned does not have a reset
> mechanism that we're doing a secondary bus reset on the root port (20:07.0)
> and that reset is generating spurious interrupts.  Managing the root port
> with the pcieport driver may avoid this.

We do not have any special config to limit binding only 1c.0 to pcieport. Perhaps this kernel does not know how to properly auto config the Dell R5500. It may be that the 1c.0 gets bound to pcieport by the ethernet driver. Just checked what is currently bound to pcieport:

/sys/bus/pci/drivers/pcieport/0000:00:1c.0 (to ethernet device 3:00.0)
/sys/bus/pci/drivers/pcieport/0000:00:1c.5 (to ethernet device 4:00.0)

Do you think the spurious interrupts were contributing to the failures in the hybrid (msi-host/intx-guest) mode, and do you want me to experiment? If so, please advise on how to manage root port 20:07.0 with pcieport.

Comment 35 wrob0123 2013-04-19 05:42:48 UTC
Added kernel boot option pcie_ports=native and got pcieport driver managing all of the root ports. Backed out the change for guest to use the prefer_intx, and took out my rc script commands to disable the USB port with IRQ 16. So now in the host we have MSI interrupts for the device, and the guest is still using INTx.

The original problem is back, and the spurious interrupts did not occur.

Then I set the guest to use prefer_intx: not losing interrupts, but the spurious interrupt problem is back. Let me know if you want to look at this further. I am happy at this point. Thanks again for the qemu updates as a workaround.
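For reference, a sketch of how pcie_ports=native would be added on RHEL 6 (GRUB legacy); the kernel version and root device shown are illustrative:

```
# /boot/grub/grub.conf (excerpt) -- append pcie_ports=native to the kernel line
title Red Hat Enterprise Linux (2.6.32-358.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-358.el6.x86_64 ro root=/dev/mapper/vg-lv_root pcie_ports=native
        initrd /initramfs-2.6.32-358.el6.x86_64.img
```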

Comment 36 Alex Williamson 2013-04-19 21:33:49 UTC
(In reply to comment #35)
> Added kernel boot option pcie_ports=native and got pcieport driver managing
> all of the root ports. Backed out the change for guest to use the
> prefer_intx, and took out my rc script commands to disable the USB port with
> IRQ 16. So now in the host we have MSI interrupts for the device, and the
> guest is still using INTx.
> 
> The original problem is back, and the spurious interrupts did not occur.
> 
> Then I set the guest to use prefer_intx: not losing interrupts, but the
> spurious interrupt problem is back. Let me know if you want to look at this
> further. I am happy at this point. Thanks again for the qemu updates as a
> workaround.

Ok, your card clearly needs the qemu update as its MSI support is either broken or incompatible with the hybrid mode.

I'm still suspicious that the spurious interrupts are coming from the root ports, but I would have expected your final test of using prefer_intx and pcie_ports=native would have alleviated the problem.

By default pcieport will ask the firmware to release port services to OS control using an ACPI interface.  If the firmware does not release control, there's nothing for pcieport to do and it will not bind to the device.  This seems to be the case for the majority of the root ports on this system.  I would have expected that forcing them to be attached with pcie_ports=native would not only put them in MSI mode, but would associate them with a driver to handle the interrupt and avoid the spurious interrupts.

In general, though, I think pcie_ports=native would not be recommended, as it overrides the OS/firmware handshake and may result in unpredictable behavior if both firmware and the OS attempt to manage the port services simultaneously.  It's possible there's a platform BIOS issue contributing to the spurious interrupts.  That should probably be handled as a separate bug if it becomes annoying.

I'll post the patches for qemu so you should be able to use the prefer_intx wrapper on the native RHEL6.5 binary.  Thanks.

Comment 37 wrob0123 2013-04-24 01:52:59 UTC
Update: the vendor provided an updated driver that uses MSI in the guest.
Of course, this avoids the hybrid mode where we lost interrupts.
Tested without the prefer_intx wrapper and it seems to work fine.
I did not go back and re-enable the USB port to retest the IRQ 16 problem.
I agree the spurious interrupt problem is a BIOS issue;
we have seen it in the past without the DDC card installed.

Comment 52 Chao Yang 2013-07-01 10:40:24 UTC
Thank you, Alex.

Steps to reproduce and verify this bug:
1. Boot a guest with pci=nomsi on the guest kernel command line and prefer_msi=on (the default) on the qemu-kvm command line
2. Check /proc/interrupts in both the host and the guest
3. Shut down the guest and reboot it with pci=nomsi on the guest kernel command line and prefer_msi=off
4. Repeat step 2

With qemu-kvm-0.12.1.2-2.377.el6.x86_64.
After step 2:
In host:
# cat /proc/interrupts | grep -i kvm
  65:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      kvm_assigned_msi_device
  
In guest:
# cat /proc/interrupts | grep eth3
 11:        307         17   IO-APIC-fasteoi   virtio0, eth3

CLI:
/usr/libexec/qemu-kvm -enable-kvm -m 2048 -smp 2,sockets=1,cores=2,threads=1 -drive file=/home/rhel6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:0b:38,bus=pci.0 -spice port=8000,disable-ticketing -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -device pci-assign,host=06:00.0,id=pf -S
                       
After step 4:
In host:
# cat /proc/interrupts | grep kvm
  38:          0          0          0          0          0          0          0          0          0          0          0          0        270          0          0          0  IR-IO-APIC-fasteoi   kvm_assigned_intx_device

In guest:
# cat /proc/interrupts | grep eth3
 11:        552         20   IO-APIC-fasteoi   virtio0, eth3

CLI:
/usr/libexec/qemu-kvm -enable-kvm -m 2048 -smp 2,sockets=1,cores=2,threads=1 -drive file=/home/rhel6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:0b:38,bus=pci.0 -spice port=8000,disable-ticketing -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -device pci-assign,host=06:00.0,id=pf,prefer_msi=off -S


From the above:
1. When booting the guest with pci=nomsi and prefer_msi=on on the qemu-kvm command line, INTx is used in the guest but MSI in the host.
2. When booting the guest with pci=nomsi and prefer_msi=off on the qemu-kvm command line, INTx is used in both the guest and the host.
So, this issue has been fixed correctly.
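The MSI-vs-INTx distinction used above can be read directly off the interrupt-type column of /proc/interrupts; a minimal sketch, with sample lines from this report standing in for the live file (the `classify_irq` helper is illustrative):

```shell
#!/bin/sh
# Classify an assigned device's interrupt mode from a /proc/interrupts line:
# "PCI-MSI" in the type column means MSI; "IO-APIC-fasteoi" means legacy INTx.
classify_irq() {
    case "$1" in
        *PCI-MSI*)         echo MSI ;;
        *IO-APIC-fasteoi*) echo INTx ;;
        *)                 echo unknown ;;
    esac
}

classify_irq ' 65: 0 0 IR-PCI-MSI-edge kvm_assigned_msi_device'        # -> MSI
classify_irq ' 38: 270 0 IR-IO-APIC-fasteoi kvm_assigned_intx_device'  # -> INTx
```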

Comment 54 errata-xmlrpc 2013-11-21 06:46:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html

