Bug 140331 - (IT_39062) stack overflows can occur on x86_64 under stack pressure when softirq's are handled
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64 Linux
Priority: medium  Severity: medium
Assigned To: Neil Horman
QA Contact: Brian Brock
Blocks: 132991

Reported: 2004-11-22 08:03 EST by Neil Horman
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-05-18 09:28:37 EDT

Attachments
patch to enable low stack checking for softirqs on x86_64 (451 bytes, patch)
2004-11-22 08:05 EST, Neil Horman

Description Neil Horman 2004-11-22 08:03:28 EST
Description of problem:
IBM and TI have reported to us that under stack pressure (when little
free process stack remains), the x86_64 platform can encounter a stack
overflow when handling a softirq.  This happens because local_bh_enable,
as defined for the x86_64 platform, calls do_softirq_thunk, which in
turn enters do_softirq on the process stack rather than on the per-irq
stacks which softirq handling normally uses.

Version-Release number of selected component (if applicable):


How reproducible:
sometimes

Steps to Reproduce:
1. Force process stack usage down to a point where < 1k of free stack
   remains
2. Lock a spinlock with spin_lock_irqsave
3. Trigger a softirq (I believe scheduling a tasklet will do this)
4. Unlock the spinlock with spin_unlock_irqrestore
  
Actual results:
system will oops on stack overflow

Expected results:
system should not oops


Additional info:
The above reproducer instructions are generic.  The problem was
initially reported in IT numbers 39062 and 46982 as problems with
ClearCase, as ClearCase uses a significant amount of stack and can
trigger the problem.  However, any method of consuming most of a
process stack can trigger this issue.
Comment 1 Neil Horman 2004-11-22 08:05:27 EST
Created attachment 107177
patch to enable low stack checking for softirqs on x86_64

This patch solves the issue by adding x86_64 to the list of arches which can
detect a low-stack condition and consequently defer their softirq processing
until a later time.
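
For readers without access to the attachment, the idea of "detect low stack, then defer" can be sketched in user-space C roughly as follows. This is only an illustrative model, not the contents of attachment 107177 and not kernel code: the names sketch_softirq_state, sketch_stack_free and sketch_do_softirq, and the 1024-byte threshold, are all hypothetical and chosen only to mirror the ~1k figure in the reproducer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold for illustration; not taken from the RHEL 3 kernel. */
#define SKETCH_LOW_STACK_BYTES 1024u

/* Minimal stand-in for per-CPU softirq state (hypothetical, not a kernel type). */
struct sketch_softirq_state {
    unsigned int pending;   /* bitmask of raised softirqs */
    unsigned int handled;   /* number of softirqs run so far */
    unsigned int deferrals; /* number of times processing was postponed */
};

/*
 * Free bytes on a downward-growing stack: the distance from the current
 * stack pointer down to the lowest address of the stack area.
 */
static size_t sketch_stack_free(size_t stack_low, size_t sp)
{
    return sp > stack_low ? sp - stack_low : 0;
}

/*
 * Model of the check: if the current stack is nearly exhausted, leave the
 * pending bits untouched so a later pass (running with more stack) can
 * handle them. Returns false only when processing was deferred.
 */
static bool sketch_do_softirq(struct sketch_softirq_state *st,
                              size_t stack_low, size_t sp)
{
    if (st->pending == 0)
        return true; /* nothing raised, nothing to defer */

    if (sketch_stack_free(stack_low, sp) < SKETCH_LOW_STACK_BYTES) {
        st->deferrals++;
        return false; /* pending bits stay set for a later pass */
    }

    while (st->pending) {
        st->pending &= st->pending - 1; /* clear the lowest set bit */
        st->handled++;                  /* stand-in for running the handler */
    }
    return true;
}
```

In this model a caller with only 512 bytes of stack left would see sketch_do_softirq return false with the pending mask unchanged, while a later call made with several kilobytes free would drain the mask. How the real patch picks its threshold and where the deferred work is eventually run are not described in this report.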
Comment 2 Ernie Petrides 2004-11-22 15:54:55 EST
Neil has posted a patch to RHKL for this on 11/22.
Comment 3 Peter Martuccelli 2004-12-20 16:47:42 EST
The patch has been accepted into RHEL 3 and is targeted for inclusion
into U5 this week.  Will update when a hot fix kernel is available
later this week.
Comment 4 Ernie Petrides 2004-12-22 17:01:32 EST
A fix for this problem has just been committed to the RHEL3 U5
patch pool this afternoon (in kernel version 2.4.21-27.4.EL).
Comment 6 Tim Powers 2005-05-18 09:28:38 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html
