
Bug 140331 (IT_39062)

Summary: stack overflows can occur on x86_64 under stack pressure when softirq's are handled
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Neil Horman <nhorman>
Assignee: Neil Horman <nhorman>
QA Contact: Brian Brock <bbrock>
CC: jparadis, peterm, petrides, riel, tao
Doc Type: Bug Fix
Last Closed: 2005-05-18 13:28:37 UTC
Bug Blocks: 132991

Attachments:
patch to enable low stack checking for softirqs on x86_64 (flags: none)

Description Neil Horman 2004-11-22 13:03:28 UTC
Description of problem:
IBM and TI have reported to us that when free stack space is low, the
x86_64 platform can encounter a stack overflow while handling a
softirq.  This happens because local_bh_enable, as defined for
the x86_64 platform, calls do_softirq_thunk, which in turn enters
do_softirq on the process stack, rather than on the per-irq
stacks which the softirq task normally uses.
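
To make the headroom concern concrete, here is a minimal userspace sketch of how
the free space on a process kernel stack could be estimated from the stack
pointer.  It is not code from the kernel or from the attached patch.  The 8 KB
stack size, the idea that a task structure occupies the bottom of the stack
area, and every MODEL_* / model_* name are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the process kernel stack is an aligned 8 KB region. */
#define MODEL_THREAD_SIZE      8192u
/* Hypothetical size of the per-task data kept at the bottom of that region. */
#define MODEL_TASK_STRUCT_SIZE 1792u

/*
 * Estimate how many bytes remain below the given stack pointer before the
 * stack would run into the reserved area at the bottom of the region.
 * The stack grows downward, so the offset of sp inside the aligned region
 * is the distance to the bottom of the region.
 */
static size_t model_free_stack(uintptr_t sp)
{
    size_t offset = (size_t)(sp & (uintptr_t)(MODEL_THREAD_SIZE - 1));

    if (offset <= MODEL_TASK_STRUCT_SIZE)
        return 0;               /* already inside the reserved area */
    return offset - MODEL_TASK_STRUCT_SIZE;
}
```

If do_softirq is entered on this same stack, every frame used by the softirq
handlers has to fit within the value returned here, which is why a nearly full
process stack can overflow.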

Version-Release number of selected component (if applicable):


How reproducible:
sometimes

Steps to Reproduce:
1. Force process stack usage up to the point where less than 1k of free
stack remains
2. Lock a spinlock with spin_lock_irqsave
3. Trigger a softirq (I believe scheduling a tasklet will do this)
4. Unlock the spinlock with spin_unlock_irqrestore
  
Actual results:
system will oops on stack overflow

Expected results:
system should not oops


Additional info:
The above reproducer instructions are generic.  The problem was
initially reported in IT numbers 39062 and 46982 as problems with
ClearCase, since ClearCase uses a significant amount of stack and can
trigger the problem.  However, any method of consuming most of a process
stack can trigger this issue.

Comment 1 Neil Horman 2004-11-22 13:05:27 UTC
Created attachment 107177 [details]
patch to enable low stack checking for softirqs on x86_64

This patch solves the issue by adding x86_64 to the list of arches which can
detect a low-stack condition and consequently defer softirq processing until a
later time.
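
The general idea of "check for low stack, then defer" can be sketched in
userspace as follows.  This is only a model of the technique, not the attached
patch: the 1 KB threshold (borrowed from the reproducer steps), the structure
layout, and all model_* / MODEL_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: treat less than 1 KB of free stack as "low". */
#define MODEL_LOW_STACK_THRESHOLD 1024u

/* Hypothetical per-CPU softirq state for this model. */
struct model_cpu {
    unsigned int pending;   /* bitmask of softirqs waiting to run */
    unsigned int handled;   /* number of softirqs run inline */
    bool deferred;          /* set when processing was postponed */
};

/*
 * Run pending softirqs inline only if enough stack is free.
 * Returns true if softirqs were processed, false if there was nothing to
 * do or the work was deferred.  Deferred work stays in the pending mask
 * so that a later call with more headroom can pick it up.
 */
static bool model_do_softirq(struct model_cpu *cpu, size_t free_stack)
{
    if (cpu->pending == 0)
        return false;

    if (free_stack < MODEL_LOW_STACK_THRESHOLD) {
        cpu->deferred = true;   /* too little stack: postpone */
        return false;
    }

    while (cpu->pending != 0) {
        cpu->pending &= cpu->pending - 1;   /* clear lowest pending bit */
        cpu->handled++;                     /* stand-in for calling the handler */
    }
    return true;
}
```

In this model a call made with 512 bytes free leaves the pending mask untouched
and sets the deferred flag, while a later call with plenty of headroom drains
the mask.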

Comment 2 Ernie Petrides 2004-11-22 20:54:55 UTC
Neil has posted a patch to RHKL for this on 11/22.

Comment 3 Peter Martuccelli 2004-12-20 21:47:42 UTC
Patch has been accepted into RHEL 3, targeting patch for inclusion
into U5 this week.  Will update when hot fix kernel is available later
this week.

Comment 4 Ernie Petrides 2004-12-22 22:01:32 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this afternoon (in kernel version 2.4.21-27.4.EL).


Comment 6 Tim Powers 2005-05-18 13:28:38 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html