Bug 442736 - launching too many guests panics with "No available IRQ to bind to: increase NR_IRQS!"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Bill Burns
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: RHEL5u2_relnotes 448753 RHEL5u3_relnotes
Reported: 2008-04-16 14:49 UTC by Bryn M. Reeves
Modified: 2018-10-20 02:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When the Dynamic IRQs available for guest virtual machines were exhausted, the dom0 kernel would crash. In this update, the crash condition has been fixed, and the number of available IRQs has been increased, which resolves this issue.
Clone Of:
Environment:
Last Closed: 2009-01-20 19:45:56 UTC
Target Upstream Version:
Embargoed:


Attachments
Posted patch. (2.38 KB, patch)
2008-10-23 19:48 UTC, Bill Burns


Links
Red Hat Product Errata RHSA-2009:0225 (private: 0, priority: normal, status: SHIPPED_LIVE)
Summary: Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update
Last Updated: 2009-01-20 16:06:24 UTC

Description Bryn M. Reeves 2008-04-16 14:49:08 UTC
Description of problem:
The number of Xen domains that can be started is determined in part by the
number of available dynamic IRQs and the number of IRQs used by each guest. This
is limited by the compile time constant NR_DYNIRQS:

#define NR_DYNIRQS             256

When this number is exceeded, find_unbound_irq() will fail and panic the system:

+static int find_unbound_irq(void)
+{
+       int irq;
+
+       /* Only allocate from dynirq range */
+       for (irq = DYNIRQ_BASE; irq < NR_IRQS; irq++)
+               if (irq_bindcount[irq] == 0)
+                       break;
+
+       if (irq == NR_IRQS)
+               panic("No available IRQ to bind to: increase NR_IRQS!\n");
+
+       return irq;
+}

With typical guests needing a minimum of two interrupts this places an upper
bound on the number of guests that can be created.

Version-Release number of selected component (if applicable):
2.6.18-86.el5xen

How reproducible:
100%

Steps to Reproduce:
1. Boot a xen dom0
2. Configure a large number of guests
3. Start booting guests one at a time
  
Actual results:
Eventually (assuming sufficient memory / I/O resources are available) the Dom0
guest will panic:

Kernel panic - not syncing: No available IRQ to bind to: increase NR_IRQS!
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

Expected results:
No panic.

Additional info:

Comment 2 Bryn M. Reeves 2008-04-16 14:51:38 UTC
Upstream discussion:

http://lists.xensource.com/archives/html/xen-devel/2006-12/msg00311.html

Comment 6 Chris Lalancette 2008-04-18 01:23:19 UTC
OK.  I briefly took a look at this.  Upstream xen has since changed it so that
if you run out of IRQs, you don't panic; this was put into xen-unstable c/s
12790.  I think we should definitely take that patch.

In the thread mentioned in Comment #2, Keir said that it would be nice to
allocate IRQs dynamically, make it a config option, or have a boot option that
you could pass to increase the number.  I think allocating the IRQs dynamically
is going to be a non-starter, since it would likely require changes to the IRQ
code that Xen shares with the bare metal kernel.  So that leaves us with 2 options:

1.  Have a boot time option that allows you to increase the number of IRQs at
boot time.

2.  Just increase NR_DYNIRQS

I like option 2, since it is a better user experience, but we can consider 1 as
well.

Chris Lalancette
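
For illustration, here is a minimal sketch of what a non-panicking allocator could look like, modeled on the find_unbound_irq() quoted in the description. It is not the upstream changeset. The constant values, the choice of -ENOSPC as the error code, and the bind_new_irq() caller are assumptions made so the example compiles on its own in user space.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in values for illustration only; the real ones come from kernel headers. */
#define DYNIRQ_BASE 256
#define NR_DYNIRQS  256
#define NR_IRQS     (DYNIRQ_BASE + NR_DYNIRQS)

static int irq_bindcount[NR_IRQS];

/* Return a free dynamic IRQ, or -ENOSPC (assumed error code) when none is left. */
static int find_unbound_irq(void)
{
    int irq;

    /* Only allocate from the dynirq range, as in the original function. */
    for (irq = DYNIRQ_BASE; irq < NR_IRQS; irq++)
        if (irq_bindcount[irq] == 0)
            return irq;

    return -ENOSPC;
}

/* Hypothetical caller: propagate the error instead of crashing dom0. */
static int bind_new_irq(void)
{
    int irq = find_unbound_irq();

    if (irq < 0)
        return irq;

    assert(irq >= DYNIRQ_BASE && irq < NR_IRQS);
    irq_bindcount[irq]++;
    return irq;
}
```

With this shape, exhausting the range makes the guest start fail with an error rather than taking down the whole host.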

Comment 8 Don Domingo 2008-04-27 23:14:58 UTC
adding to RHEL5.2 release notes updates:

<quote>
    * dom0 has a system-wide IRQ (interrupt request line) limit of 256, which is
consumed as follows:
          o 3 per physical CPU.
          o 1 per guest device (i.e. NIC or block device)

      When the IRQ limit is reached, the system will crash. As such, check your
IRQ consumption to make sure that the number of guests you create (and their
respective block devices) does not exhaust the IRQ limit.
</quote>

please advise if any further revisions are required. thanks!
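
A small sketch of the arithmetic in the note above, in case it helps readers sanity-check a planned configuration. The function and macro names are hypothetical; only the formula (3 per physical CPU, 1 per guest device) and the limit of 256 come from the note.

```c
#include <assert.h>

#define DOM0_IRQ_LIMIT 256   /* system-wide limit quoted in the note */

/* Estimated IRQs in use: 3 per physical CPU plus 1 per guest device. */
static int estimate_irqs(int physical_cpus, int guest_devices)
{
    assert(physical_cpus >= 0 && guest_devices >= 0);
    return 3 * physical_cpus + guest_devices;
}

/* Nonzero when the estimate stays within the limit. */
static int within_irq_limit(int physical_cpus, int guest_devices)
{
    return estimate_irqs(physical_cpus, guest_devices) <= DOM0_IRQ_LIMIT;
}
```

For example, 8 physical CPUs and 100 guests with one NIC and one block device each would give 3*8 + 200 = 224, which is still under the limit.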

Comment 9 Issue Tracker 2008-04-28 15:38:12 UTC
----- Additional Comments From krister.com  2008-04-28 11:09 EDT -------
Should there be something added to let the user know how to check their used
Dynamic IRQs?  I worry that without this the user might not know how to
determine the number of Dynamic IRQs they have used.  I ran this in dom0 on a
blade with 2 guests running:

[root@host ~]# grep Dynamic-irq /proc/interrupts | wc -l
30 


This event sent from IssueTracker by jkachuck 
 issue 173656

Comment 10 Don Domingo 2008-04-29 23:14:39 UTC
thanks, appending to note:

<quote>
To determine how many IRQs you are currently consuming, run the command grep
Dynamic-irq /proc/interrupts | wc -l.
</quote>

please advise if any further revisions are required. thanks!
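
For anyone who wants the same count from a monitoring program rather than the shell, here is a rough C equivalent of the grep/wc pipeline. The function name is hypothetical, and the 4096-byte line buffer is an assumption about how long a /proc/interrupts line can get.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the lines of an open stream that contain "Dynamic-irq",
 * mirroring `grep Dynamic-irq /proc/interrupts | wc -l`.
 * Assumes no line is longer than the buffer.
 */
static int count_dynirq_lines(FILE *fp)
{
    char line[4096];
    int count = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL)
        if (strstr(line, "Dynamic-irq") != NULL)
            count++;

    return count;
}
```

In dom0 this would be called on a stream opened with fopen("/proc/interrupts", "r").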

Comment 11 RHEL Program Management 2008-06-09 21:58:22 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Ryan Lerch 2008-08-11 01:24:22 UTC
Tracking this bug for the Red Hat Enterprise Linux 5.3 Release Notes. 

This Release Note is currently located in the Known Issues section.

Comment 13 Ryan Lerch 2008-08-11 01:24:22 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Comment 14 Bill Burns 2008-08-11 16:54:48 UTC
I have a test image that has the fix for the panic as well as an increase in the number of IRQs (256 more). Unfortunately the increase breaks the kernel ABI, and some further work is needed to see if that can be overcome. While this is being looked at, could you please test this kernel to see if it solves the crash issue and what number of guests can be started with this change? The image is people.redhat.com/bburns/kernel-xen-2.6.18-103.el5IRQFIX.x86_64.rpm

Thanks.

Comment 15 Don Zickus 2008-09-13 01:46:47 UTC
in kernel-2.6.18-113.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 20 Bill Burns 2008-10-23 19:47:17 UTC
Putting this back to assigned. Testing of the pre-beta kernels has shown that the fix for the crash when exhausting the IRQs was not effective. It added the error path logic, but there was a flaw in the implementation: the return values were stored in unsigned variables and then compared for < 0, so the check could never succeed. Upstream has fixed this, and it is a small incremental change to incorporate the fix.

Comment 21 Bill Burns 2008-10-23 19:48:23 UTC
Created attachment 321335 [details]
Posted patch.

Patch to fix checking for negative IRQ return values.
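
The attached patch is not reproduced here, but the class of bug is easy to show in isolation. The sketch below uses hypothetical function names and a stub allocator that always reports exhaustion. It only illustrates why a negative error code stored in an unsigned variable slips past a "< 0" check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stub: behaves like an allocator whose IRQ range is exhausted. */
static int find_unbound_irq_stub(void)
{
    return -ENOSPC;
}

/* Flawed pattern: the unsigned variable can never compare below zero. */
static int bind_flawed(void)
{
    unsigned int irq = find_unbound_irq_stub();

    if (irq < 0)        /* always false for an unsigned value */
        return -1;
    return 0;           /* the error is silently treated as success */
}

/* Corrected pattern: keep the result signed until it has been checked. */
static int bind_fixed(void)
{
    int irq = find_unbound_irq_stub();

    if (irq < 0)
        return irq;
    return 0;
}

/* Quick self-check of the two patterns above. */
static void check_patterns(void)
{
    assert(bind_flawed() == 0);        /* exhaustion goes unnoticed */
    assert(bind_fixed() == -ENOSPC);   /* exhaustion is reported */
}
```

Compilers can flag the flawed comparison (for example GCC with -Wtype-limits), which is one way to catch this kind of mistake early.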

Comment 22 Don Zickus 2008-10-29 16:17:40 UTC
in kernel-2.6.18-121.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 26 Bill Burns 2008-11-07 15:00:52 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,9 +1,2 @@
 (all architectures)
-dom0 has a system-wide IRQ (interrupt request line) limit of 256, which is consumed as follows:
+When the Dynamic IRQs available for guest virtual machines were exhausted the domain 0 kernel would crash. This patch fixed the crash condition and also greatly increased the number of available IRQs for x86_64 platforms.
-
-    * 3 per physical CPU.
-    * 1 per guest device (i.e. NIC or block device)
-
-When the IRQ limit is reached, the system will crash. As such, check your IRQ consumption to make sure that the number of guests you create (and their respective block devices) do not exhaust the IRQ limit.
-
-To determine how many IRQs you are currently consuming, run the command grep Dynamic-irq /proc/interrupts | wc -l.

Comment 27 James Takahashi 2008-11-08 02:27:13 UTC
With much help from rharper, I finally have a small, ramdisk-based guest config suitable for creating many guest instances on my RHEL5.3 beta (2.6.18-121.el5xen #1 SMP Mon Oct 27 22:03:03 EDT 2008 x86_64) system:

  [root@elm3c13 xen]# cat /xen/disk2/etc-xen-share/test1
  name = "test1"
  maxmem = 64
  memory = 64
  vcpus = 1
  kernel = "/etc/xen/vmlinuz-autobench-xen"
  root = "/dev/xvda"
  extra = "console=xvc0"
  on_poweroff = "destroy"
  on_reboot = "restart"
  on_crash = "preserve"
  vif = [ '' ]
  disk = [ 'tap:aio:/etc/xen/initrd-1.1-i386.img,xvda,r' ]

When I tried to create a bunch of guests based upon this config, I ran into 'Error: (12, 'Cannot allocate memory')' messages at the 89th guest, well before IRQs were exhausted ('grep Dynamic-irq /proc/interrupts | wc -l' reports only 202, and I had 26 even before creating the first guest).  I also saw some '(XEN) Cannot handle page request order 0!' messages on the console while these failures were occurring.

The system has plenty of free memory (MemTotal: 33554432 kB; MemFree: 32273780 kB), so this error is confusing.  Am I doing something wrong?

Comment 28 James Takahashi 2008-11-08 02:29:02 UTC
(In reply to comment #27)
>  my RHEL5.3 beta
> (2.6.18-121.el5xen #1 SMP Mon Oct 27 22:03:03 EDT 2008 x86_64) system:

Er, I meant to say snap1.

Comment 29 James Takahashi 2008-11-08 02:33:56 UTC
Oh, a couple of other data points I forgot to mention.  My first attempt to work around this issue was to reduce maxmem and memory from 64 to 32.  But the system failed in exactly the same way, and still at the 89th guest.

Then I thought I might perhaps consume IRQs more quickly by allocating more CPUs per guest.  But bumping vcpus from 1 to 4 caused the system to hit the 'Cannot allocate memory' failure even earlier -- at the 68th rather than the 89th guest instance.

Comment 30 Bill Burns 2008-11-10 14:02:18 UTC
Yes, it's unlikely that you will be able to exhaust the IRQs since the patch increased them by quite a large margin. It is assumed that with the IRQ limit out of the way the next limitation would be hit. Please file a separate bug report with the details. I think for verification of this bug, getting past the 70 or so guests that used to fail is sufficient. Thanks for the testing!

Comment 32 Ryan Lerch 2008-11-12 02:50:42 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,2 +1 @@
-(all architectures)
+When the Dynamic IRQs available for guest virtual machines were exhausted, the dom0 kernel would crash. In this update, the crash condition has been fixed, and the number of available IRQs has been increased, which resolves this issue.
-When the Dynamic IRQs available for guest virtual machines were exhausted the domain 0 kernel would crash. This patch fixed the crash condition and also greatly increased the number of available IRQs for x86_64 platforms.

Comment 33 James Takahashi 2008-11-12 21:01:24 UTC
For the sake of completeness, I tried my test scenario on a plain vanilla RHEL5.2 Xen installation.  As I'd hoped, I saw the following:

  Kernel panic - not syncing: No available IRQ to bind to: increase NR_IRQS!

   (XEN) Domain 0 crashed: rebooting machine in 5 seconds.

This happens at the 116th guest, which attempts to use the 257th IRQ.
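
As a cross-check of that number, here is a small sketch of the arithmetic. The helper name is hypothetical. The inputs are assumptions: a baseline of 26 dynamic IRQs before any guest starts (the figure from comment #27, measured on the 5.3 snapshot and assumed to be similar here) and 2 IRQs per guest (one vif and one disk in this config).

```c
#include <assert.h>

/*
 * Hypothetical helper: 1-based index of the first guest whose start needs
 * an IRQ beyond the limit.
 *   limit     - size of the dynamic IRQ pool
 *   baseline  - dynamic IRQs already in use before any guest starts
 *   per_guest - dynamic IRQs consumed by each guest (must be positive)
 */
static int first_failing_guest(int limit, int baseline, int per_guest)
{
    int free_irqs = limit - baseline;

    assert(per_guest > 0);
    if (free_irqs < 0)
        free_irqs = 0;

    return free_irqs / per_guest + 1;
}
```

With a limit of 256, a baseline of 26, and 2 IRQs per guest, 115 guests use the pool up exactly and the 116th asks for the 257th IRQ, which agrees with the observation above.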

I then repeated the experiment, but first installed the -119 kernel -- that was the last test kernel provided prior to moving my testing to RHEL5.3.  With that setup, I observed that I can allocate at least 256 guests with at least 512 IRQs without crashing.  If desired, I can rerun that setup to the next power of 2 to see what happens.

In both of the RHEL5.2 cases, I see the following errors starting at the 100th guest:

  tap tap-312-51712: 2 getting info
  blk_tap: Error initialising /dev/xen/blktap - No more devices
  blk_tap: Error initialising /dev/xen/blktap - No more devices
  <last msg repeats about 8 times per guest-creation attempt>

The guests get created, but eventually get marked 'crashed'.  

Bottom line is that it seems that we used to be able to get 99 usable guests with RHEL5.2, whereas with RHEL5.3, we can only get 88, at least based upon this particular guest config.  Not saying that's a problem -- just providing an FYI.

Unless there are requests for further tests, I think this bug can be closed.  I'll open a separate bug to track the 'Cannot allocate memory' issue.

Comment 34 Bill Burns 2008-11-12 21:40:56 UTC
Thanks for the testing. Please do open the new BZ to track the memory issue. With the existing testing, plus my forcing IRQ exhaustion via a kernel hack, I am confident this issue is all set.

Comment 35 Chris Lalancette 2008-11-14 11:29:02 UTC
(In reply to comment #33)
> In both of the RHEL5.2 cases, I see the following errors starting at the 100th
> guest:
> 
>   tap tap-312-51712: 2 getting info
>   blk_tap: Error initialising /dev/xen/blktap - No more devices
>   blk_tap: Error initialising /dev/xen/blktap - No more devices
>   <last msg repeats about 8 times per guest-creation attempt>

If I remember correctly, you are running into blktap limitations here.  There is a hard-coded 100 disk limit currently in blktap, so you get the "No more devices" message when you try to add more disks and it doesn't find any more room in the array.  You'll probably have better luck using LVM backed guests, since there is no such limitation there.
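
To make the failure mode concrete, here is a generic sketch of a fixed-size slot table that runs out of room. This is not the blktap source. The names are hypothetical, and only the figure of 100 is taken from the paragraph above.

```c
#include <assert.h>

#define MAX_SLOTS 100   /* hypothetical name; 100 is the limit mentioned above */

static int slot_in_use[MAX_SLOTS];

/* Claim the first free slot; return its index, or -1 when the table is full. */
static int claim_slot(void)
{
    int i;

    for (i = 0; i < MAX_SLOTS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            return i;
        }
    }
    return -1;   /* the "No more devices" case */
}

/* Give a slot back so a later claim can reuse it. */
static void release_slot(int i)
{
    assert(i >= 0 && i < MAX_SLOTS);
    slot_in_use[i] = 0;
}
```

Raising a limit like this means either enlarging the array or replacing it with a dynamically sized structure.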

If we really want to support more blktap disks (and we probably do), we should open up another bug to up that limit in blktap (but this will have to be for later releases).

Chris Lalancette

Comment 39 errata-xmlrpc 2009-01-20 19:45:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

