Bug 140331 (IT_39062) - stack overflows can occur on x86_64 under stack pressure when softirq's are handled
Alias: IT_39062
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64 Linux
Target Milestone: ---
Assignee: Neil Horman
QA Contact: Brian Brock
Depends On:
Blocks: 132991
Reported: 2004-11-22 13:03 UTC by Neil Horman
Modified: 2007-11-30 22:07 UTC

Doc Type: Bug Fix
Last Closed: 2005-05-18 13:28:37 UTC

Attachments
patch to enable low stack checking for softirqs on x86_64 (451 bytes, patch)
2004-11-22 13:05 UTC, Neil Horman

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2005:294
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 5
Last Updated: 2005-05-18 04:00:00 UTC

Description Neil Horman 2004-11-22 13:03:28 UTC
Description of problem:
IBM and TI have reported to us that under stack pressure (when little
process stack remains free), the x86_64 platform can encounter a stack
overflow when handling a softirq.  This happens because local_bh_enable,
as defined for the x86_64 platform, calls do_softirq_thunk, which in turn
enters do_softirq on the process stack, rather than on the per-irq
stacks that softirq processing normally uses.
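
To illustrate why entering do_softirq on an already deep process stack
is risky, here is a minimal user-space sketch of how the remaining stack
can be estimated for a fixed-size, size-aligned stack that grows
downward.  The 8 KiB THREAD_SIZE, the "reserved" area at the bottom of
the region, and the function names are assumptions made for this
illustration; they are not taken from the RHEL 3 source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the aligned stack region (illustration only). */
#define THREAD_SIZE 8192u

/*
 * Bytes still free below sp in a downward-growing stack that occupies a
 * THREAD_SIZE-aligned region whose lowest `reserved` bytes hold other data.
 */
static inline uint64_t stack_bytes_free(uint64_t sp, uint64_t reserved)
{
    uint64_t offset = sp & (THREAD_SIZE - 1);  /* distance from region base */

    assert(reserved < THREAD_SIZE);
    return offset > reserved ? offset - reserved : 0;
}

/* Nonzero if a handler needing `need` bytes would not fit on this stack. */
static inline int would_overflow(uint64_t sp, uint64_t reserved, uint64_t need)
{
    return stack_bytes_free(sp, reserved) < need;
}
```

With these assumed numbers, a stack pointer 900 bytes above the region
base cannot accommodate a handler that needs 1024 bytes, which is the
kind of situation the report describes.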

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Force process stack usage up to the point where only about 1k of
   stack remains free.
2. Lock a spinlock with spin_lock_irqsave.
3. Trigger a softirq (I believe scheduling a tasklet will do this).
4. Unlock the spinlock with spin_unlock_irqrestore.

Actual results:
system will oops on stack overflow

Expected results:
system should not oops

Additional info:
The above reproducer instructions are generic.  The problem was
initially reported in IT numbers 39062 and 46982 as problems with
ClearCase, as ClearCase uses a significant amount of stack and can
trigger the problem.  However, any method of consuming most of a
process stack can trigger this issue.

Comment 1 Neil Horman 2004-11-22 13:05:27 UTC
Created attachment 107177
patch to enable low stack checking for softirqs on x86_64

This patch solves the issue by adding x86_64 to the list of arches that can
detect a low-stack condition and consequently defer their softirq processing
until a later time.
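
The check-and-defer idea can be sketched in user space as follows.  This
is only a rough model under stated assumptions, not the attached patch:
the 1024-byte threshold, the struct, and the function name are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold in bytes (illustration only). */
#define LOW_STACK_THRESHOLD 1024u

struct softirq_model {
    unsigned pending;  /* bitmask of raised softirqs */
    unsigned handled;  /* number of softirqs run so far */
};

/*
 * Run all pending softirqs only if enough stack is free.  Otherwise leave
 * them pending so a later call can handle them.
 * Returns 1 if work was run, 0 if it was deferred or nothing was pending.
 */
static int do_softirq_model(struct softirq_model *m, size_t stack_free)
{
    assert(m != NULL);

    if (m->pending == 0)
        return 0;                      /* nothing to do */
    if (stack_free < LOW_STACK_THRESHOLD)
        return 0;                      /* low on stack: defer */

    while (m->pending) {
        m->pending &= m->pending - 1;  /* clear the lowest set bit */
        m->handled++;
    }
    return 1;
}
```

In this model the deferred work simply stays in the pending mask; the
caller is expected to retry later from a context with more free stack.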

Comment 2 Ernie Petrides 2004-11-22 20:54:55 UTC
Neil has posted a patch to RHKL for this on 11/22.

Comment 3 Peter Martuccelli 2004-12-20 21:47:42 UTC
Patch has been accepted into RHEL 3 and is targeted for inclusion
into U5 this week.  Will update when a hot fix kernel is available
later this week.

Comment 4 Ernie Petrides 2004-12-22 22:01:32 UTC
A fix for this problem was committed to the RHEL3 U5
patch pool this afternoon (in kernel version 2.4.21-27.4.EL).

Comment 6 Tim Powers 2005-05-18 13:28:38 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

