Bug 112577

Summary: softirq to assist with stack overflows
Product: Red Hat Enterprise Linux 2.1 Reporter: sheryl sage <sheryl.sage>
Component: kernel Assignee: Rik van Riel <riel>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1 CC: riel, summer, tao
Target Milestone: --- Keywords: FutureFeature
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Last Closed: 2007-10-19 19:23:30 UTC Type: ---

Description sheryl sage 2003-12-23 15:12:14 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
Some VERITAS have suffered from do_IRQ's (2048 bytes away from) stack 
overflow messages.  Despite lowering that threshold to 1024, we're 
still worried by the possibility of overflow, and are working to 
reduce our stack usage.

There's a small change to Linux which would reduce the likelihood of 
stack overflow for all.  It's far from being a complete solution, but 
a small enough change to be worth making.

In many (but not all) drivers, the complex and stack-hungry part of 
interrupt processing is done in the softirq rather than the hardirq. 
do_softirq() already defers softirq work to its daemon when swamped 
by more softirqs while it's working.  This patch adds a stack check, 
deferring all softirq work to the daemon when the stack is too deep.

How deep is too deep?  Given the hardirq warning at 1k, we estimate 
the threshold for softirq deferral should be between 2k and 3k, and 
have set 2560 here.  Much lower than that would make it ineffective; 
much higher than that would impact performance.

This patch differs slightly from the patch we offered earlier for 
RHEL3.0: it extends the check from i386 to other architectures, 
except parisc and x86_64, and uses a threshold of 5120 on 64-bit arches.
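
The two thresholds quoted above (2560 and 5120) are consistent with the STACK_DEFER_THRESHOLD definition in the patch posted in Comment 2, which scales with the word size as 80 * BITS_PER_LONG. A minimal user-space sketch of that arithmetic follows; the helper name defer_threshold is hypothetical and is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in the patch): bytes of free stack below which
 * softirq work would be handed to ksoftirqd. The factor 80 is taken from
 * the patch's STACK_DEFER_THRESHOLD; bits_per_long stands in for the
 * kernel's BITS_PER_LONG.
 */
size_t defer_threshold(unsigned bits_per_long)
{
    /* Only the two word sizes discussed in the report are modelled. */
    assert(bits_per_long == 32 || bits_per_long == 64);
    return 80u * (size_t)bits_per_long;
}
```

Calling defer_threshold(32) gives 2560 and defer_threshold(64) gives 5120, matching the figures in the description.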

--- 2.4.9-e.27/kernel/softirq.c	Tue Sep 23 16:46:51 2003
+++ linux/kernel/softirq.c	Wed Sep 24 20:03:09 2003
@@ -17,6 +17%2

Comment 1 Jason Baron 2003-12-23 16:56:03 UTC
Seems like the patch got cut off... Also, we made a number of changes
in U3 to address stack overflow in 2.1. The e.34 kernel might be worth
trying.

Comment 2 sheryl sage 2003-12-23 17:07:58 UTC
--- 2.4.9-e.27/kernel/softirq.c	Tue Sep 23 16:46:51 2003
+++ linux/kernel/softirq.c	Wed Sep 24 20:03:09 2003
@@ -17,6 +17,9 @@
 #include <linux/init.h>
 #include <linux/tqueue.h>
 
+/* Defer softirqs to ksoftirqd if free stack less than this */
+#define STACK_DEFER_THRESHOLD	(80 * BITS_PER_LONG)
+
 /*
    - No shared variables, all the data are CPU local.
    - If a softirq needs serialization, let it serialize itself
@@ -75,6 +78,21 @@
 	if (pending) {
 		struct softirq_action *h;
 
+#if !defined(CONFIG_PARISC) && !defined(CONFIG_X86_64)
+		{
+			unsigned long esp = (unsigned long) &esp;
+			unsigned long tsk = (unsigned long) current;
+
+			if (unlikely(esp < tsk + sizeof(struct task_struct) +
+			    STACK_DEFER_THRESHOLD) && esp >= tsk &&
+			    tsk != (unsigned long) ksoftirqd_task(cpu)) {
+				wakeup_softirqd(cpu);
+				local_irq_restore(flags);
+				return;
+			}
+		}
+#endif /* !CONFIG_PARISC !CONFIG_X86_64 */
+
 		mask = ~pending;
 		local_bh_disable();
 restart:
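
For readers following the condition in the hunk above, here is a rough user-space model of the same test. It assumes the 2.4-era i386 layout, where the task_struct sits at the bottom of the kernel stack area and the stack grows down toward it. The function names, the addresses used in the usage note, and the task_struct size there are all made up for illustration; none of them come from the kernel or the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes left between the stack pointer and the end of
 * the task_struct. Only meaningful when sp lies above the task_struct.
 */
size_t free_stack_bytes(uintptr_t sp, uintptr_t task_base,
                        size_t task_struct_size)
{
    assert(sp >= task_base + task_struct_size);
    return (size_t)(sp - (task_base + task_struct_size));
}

/*
 * Model of the patch's condition. Defer to ksoftirqd when:
 *   - the stack pointer is within `threshold` bytes of the task_struct end,
 *   - the stack pointer is not below the task_struct base (mirrors the
 *     patch's `esp >= tsk` term), and
 *   - the running task is not ksoftirqd itself, which must do the work.
 */
bool should_defer(uintptr_t sp, uintptr_t task_base, size_t task_struct_size,
                  size_t threshold, bool is_ksoftirqd)
{
    return sp < task_base + task_struct_size + threshold &&
           sp >= task_base &&
           !is_ksoftirqd;
}
```

With a made-up base of 0x10000, a made-up task_struct size of 1024 and the 32-bit threshold of 2560, a stack pointer 2000 bytes above the task_struct yields should_defer(...) == true, while one 4000 bytes above it yields false. The same shallow-stack case returns false when is_ksoftirqd is set, since the daemon has nowhere further to defer to.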


Comment 5 Rik van Riel 2004-06-09 03:13:51 UTC
Just cleaning out my bug list ;)

Comment 6 RHEL Program Management 2007-10-19 19:23:30 UTC
This bug is filed against RHEL2.1, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products.  Since
this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/

If you feel this bug is indeed mission critical, please contact your
support representative.  You may be asked to provide detailed
information on how this bug is affecting you.