From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
Some VERITAS have suffered from do_irq's (2048 bytes away from) stack overflow messages. Despite lowering that threshold to 1024, we're still worried by the possibility of overflow, and are working to reduce our stack usage.

There's a small change to Linux which would reduce the likelihood of stack overflow for all. It's far from being a complete solution, but a small enough change to be worth making.

In many (but not all) drivers, the complex and stack-hungry part of interrupt processing is done in the softirq rather than the hardirq. do_softirq() already defers softirq work to its daemon when swamped by more softirqs while it's working. This patch adds a stack check, deferring all softirq work to the daemon when the stack is too deep.

How deep is too deep? Given the hardirq warning at 1k, we estimate the threshold for softirq deferral should be between 2k and 3k, and have set 2560 here. Much lower than that would make it ineffective, much higher than that would impact performance.

This patch differs slightly from the patch we offered earlier for RHEL3.0: extending it from i386 to other architectures, excepting parisc and x86_64; with threshold 5120 on 64-bit arches.

--- 2.4.9-e.27/kernel/softirq.c Tue Sep 23 16:46:51 2003
+++ linux/kernel/softirq.c Wed Sep 24 20:03:09 2003
@@ -17,6 +17%2
Seems like the patch got cut off... Also, we made a number of changes in U3 to address stack overflow in 2.1. The e.34 kernel might be worth trying.
--- 2.4.9-e.27/kernel/softirq.c Tue Sep 23 16:46:51 2003
+++ linux/kernel/softirq.c Wed Sep 24 20:03:09 2003
@@ -17,6 +17,9 @@
 #include <linux/init.h>
 #include <linux/tqueue.h>
 
+/* Defer softirqs to ksoftirqd if free stack less than this */
+#define STACK_DEFER_THRESHOLD (80 * BITS_PER_LONG)
+
 /*
    - No shared variables, all the data are CPU local.
    - If a softirq needs serialization, let it serialize itself
@@ -75,6 +78,21 @@
 	if (pending) {
 		struct softirq_action *h;
 
+#if !defined(CONFIG_PARISC) && !defined(CONFIG_X86_64)
+		{
+			unsigned long esp = (unsigned long) &esp;
+			unsigned long tsk = (unsigned long) current;
+
+			if (unlikely(esp < tsk + sizeof(struct task_struct) +
+				STACK_DEFER_THRESHOLD) && esp >= tsk &&
+			    tsk != (unsigned long) ksoftirqd_task (cpu)) {
+				wakeup_softirqd(cpu);
+				local_irq_restore(flags);
+				return;
+			}
+		}
+#endif /* !CONFIG_PARISC !CONFIG_X86_64 */
+
 		mask = ~pending;
 		local_bh_disable();
 restart:
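For anyone checking the numbers: the condition in the patch can be modelled in user space roughly as below. This is only a sketch, not kernel code. The helper names defer_threshold() and should_defer() are made up for illustration, and it assumes the 2.4 layout the patch relies on, with the task_struct at the bottom of the stack area and the stack growing down towards it.

```c
#include <assert.h>   /* only needed for the checks described below */
#include <stdbool.h>

/* Same formula as STACK_DEFER_THRESHOLD in the patch: 80 * BITS_PER_LONG.
 * Gives 2560 bytes on 32-bit arches and 5120 bytes on 64-bit arches. */
static unsigned long defer_threshold(unsigned long bits_per_long)
{
    return 80UL * bits_per_long;
}

/* Hypothetical user-space model of the test added to do_softirq().
 * sp               - current stack pointer (the patch uses &esp)
 * task_base        - address of the task_struct (the patch uses current)
 * task_struct_size - sizeof(struct task_struct)
 * is_ksoftirqd     - true when already running in the softirq daemon
 * Returns true when softirq work should be handed to ksoftirqd. */
static bool should_defer(unsigned long sp, unsigned long task_base,
                         unsigned long task_struct_size,
                         unsigned long threshold, bool is_ksoftirqd)
{
    /* Lowest address the stack may reach before hitting the task_struct. */
    unsigned long stack_floor = task_base + task_struct_size;

    return sp < stack_floor + threshold  /* less than threshold bytes free */
        && sp >= task_base               /* sp really is on this task's stack */
        && !is_ksoftirqd;                /* the daemon must never defer to itself */
}
```

With an invented task_struct size of 1700 bytes and a task at 0x1000, a stack pointer 2000 bytes above the task_struct gives should_defer(...) true at the 2560 threshold, while one 4000 bytes above it gives false.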
Just cleaning out my bug list ;)
This bug is filed against RHEL2.1, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed.

For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/

If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.