Bug 151222 - smp_apic_timer_interrupt() executes on kernel thread stack
Summary: smp_apic_timer_interrupt() executes on kernel thread stack
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i586
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
: ---
Assignee: Ingo Molnar
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 154907 156322
 
Reported: 2005-03-16 03:42 UTC by craig harmer
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-10-05 12:49:53 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to handle timer interrupts on the interrupt stack (1.54 KB, patch)
2005-03-16 03:49 UTC, craig harmer


Links
System: Red Hat Product Errata
ID: RHSA-2005:514
Private: 0
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 2
Last Updated: 2005-10-05 04:00:00 UTC

Description craig harmer 2005-03-16 03:42:09 UTC
Description of problem:

Red Hat engineers told VERITAS that in RHEL 4 release on the x86
architecture the kernel stack size would be reduced to 4 Kbyte, but
all interrupts would be handled on a separate stack.

it turns out this is only partially true.  on typical SMP x86
implementations the APIC timer interrupt is handled on the standard
thread stack and can consume a significant amount of stack space
(around 376 bytes).  here's an example stack that we collected using a
special kernel we developed that tracks stack consumption:

        [kernel]     smp_apic_timer_interrupt       (+0x84 = 0x000e0)
        [kernel]     smp_local_timer_interrupt      (+0x14 = 0x000f4)
        [kernel]     update_process_times           (+0x10 = 0x00104)
        [kernel]     scheduler_tick                 (+0x3c = 0x00140)
        [kernel]     rebalance_tick                 (+0x24 = 0x00164)
        [kernel]     load_balance                   (+0x24 = 0x00188)
        [kernel]     wake_up_process                (+0xc  = 0x00194)
        [kernel]     try_to_wake_up                 (+0x48 = 0x001dc)
        [kernel]     activate_task                  (+0x1c = 0x001f8)
        [kernel]     sched_clock                    (+0xc  = 0x00204)
        [kernel]     cycles_2_ns                    (+0x20 = 0x00224)
                                                                     
          
where the format is (+stack_frame_size = cumulative_stack_depth).

ignoring the reported size of the stack frame for
smp_apic_timer_interrupt(), which is incorrect, the stack depth here is
0x224 - 0xe0 + 0x34 = 0x178 (376 bytes), where 0x34 is the actual amount
of stack used by smp_apic_timer_interrupt(), including the interrupt
stack frame.
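The arithmetic can be cross-checked against the trace.  A minimal sketch in C, assuming the per-function frame sizes are exactly as listed in the trace and using the 0x34 figure from the formula for smp_apic_timer_interrupt() itself (the function and array names are made up for illustration):

```c
#include <stddef.h>

/* Frame sizes copied from the trace, excluding the first entry
 * (smp_apic_timer_interrupt), whose reported size is said to be wrong. */
static const unsigned int callee_frames[] = {
    0x14, /* smp_local_timer_interrupt */
    0x10, /* update_process_times */
    0x3c, /* scheduler_tick */
    0x24, /* rebalance_tick */
    0x24, /* load_balance */
    0x0c, /* wake_up_process */
    0x48, /* try_to_wake_up */
    0x1c, /* activate_task */
    0x0c, /* sched_clock */
    0x20, /* cycles_2_ns */
};

/* Sum of the callee frames; this is the 0x224 - 0xe0 term. */
unsigned int callee_stack_bytes(void)
{
    unsigned int total = 0;
    for (size_t i = 0; i < sizeof callee_frames / sizeof callee_frames[0]; i++)
        total += callee_frames[i];
    return total;
}

/* Callee frames plus the 0x34 bytes attributed to
 * smp_apic_timer_interrupt() and the interrupt stack frame. */
unsigned int timer_interrupt_stack_bytes(void)
{
    return callee_stack_bytes() + 0x34;
}
```

callee_stack_bytes() comes to 0x144 and timer_interrupt_stack_bytes() to 0x178, which is the 376 bytes quoted in the description.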

VERITAS is quite short of stack space on 32 bit Intel and would like
to have as much available as possible.  we've been restructuring our
code to decrease our stack consumption, but still find that our stacks
can be quite deep.  as you've probably guessed from the above trace,
we've developed code to track kernel stack usage and find *all*
instances of deep stack consumption (using a gcc compiler option to
insert code in each function entry and exit point).
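The report does not name the gcc option.  One option that behaves this way is -finstrument-functions, which makes gcc call __cyg_profile_func_enter() and __cyg_profile_func_exit() around every instrumented function.  The following is a user-space sketch of depth tracking built on those hooks, not the VERITAS kernel code; the stack_track_* names are hypothetical, and it assumes a stack that grows downward, as on x86:

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_base; /* highest stack address of interest */
static size_t    max_depth;  /* deepest usage seen so far, in bytes */

/* Start (or restart) tracking relative to 'base'. */
__attribute__((no_instrument_function))
void stack_track_init(uintptr_t base)
{
    stack_base = base;
    max_depth = 0;
}

/* Deepest stack usage recorded since stack_track_init(). */
__attribute__((no_instrument_function))
size_t stack_track_max(void)
{
    return max_depth;
}

/* Record one stack-pointer sample; lower addresses mean deeper stacks. */
__attribute__((no_instrument_function))
void stack_track_sample(uintptr_t sp)
{
    if (stack_base != 0 && sp < stack_base && stack_base - sp > max_depth)
        max_depth = stack_base - sp;
}

/* Called by gcc-generated code on entry to each instrumented function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    stack_track_sample((uintptr_t)__builtin_frame_address(0));
}

/* Called on exit; this sketch only tracks the high-water mark,
 * so there is nothing to do here. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}
```

A program built with -finstrument-functions would call stack_track_init((uintptr_t)__builtin_frame_address(0)) early in main() and read stack_track_max() at the end.  The hooks are marked no_instrument_function so they do not instrument themselves.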

while we currently believe that we don't have any situations where our
stack is within 376 bytes of overflow (such that an interrupt would
take us over the limit), we're still testing our software stack and
are worried that something might come up in a code path we haven't
adequately exercised yet.  with that in mind, we're making this
request against the possibility/probability that we'll need the
additional stack space.
                                                                     
          
it's relatively easy to make timer interrupts execute on a separate
stack and the change should have zero measurable impact on the
kernel performance, so we'd like Red Hat to consider making this
change to benefit us and other subsystems that may have deep stacks
(one example being the NFS client code).
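As a user-space analogy only (this is not the attached kernel patch, which is not reproduced in this report), running a handler on a separate stack can be sketched with the glibc ucontext functions.  The handler name, the stack size, and the helper function are all made up for illustration:

```c
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#define ALT_STACK_SIZE (64 * 1024) /* arbitrary size for the sketch */

static ucontext_t caller_ctx, handler_ctx;
static uintptr_t  handler_sp; /* address of a local seen inside the handler */

/* Stand-in for the deep timer-tick call chain. */
static void fake_timer_handler(void)
{
    volatile char local = 0;
    handler_sp = (uintptr_t)&local;
    /* Returning resumes caller_ctx through uc_link. */
}

/* Run fake_timer_handler() on a freshly allocated stack.
 * Returns 1 if the handler's locals lived on that stack,
 * 0 if they did not, and -1 on error. */
int run_handler_on_separate_stack(void)
{
    char *alt = malloc(ALT_STACK_SIZE);
    if (alt == NULL)
        return -1;

    if (getcontext(&handler_ctx) == -1) {
        free(alt);
        return -1;
    }
    handler_ctx.uc_stack.ss_sp = alt;
    handler_ctx.uc_stack.ss_size = ALT_STACK_SIZE;
    handler_ctx.uc_link = &caller_ctx; /* where to go when the handler returns */
    makecontext(&handler_ctx, fake_timer_handler, 0);

    handler_sp = 0;
    if (swapcontext(&caller_ctx, &handler_ctx) == -1) {
        free(alt);
        return -1;
    }

    int on_alt = handler_sp >= (uintptr_t)alt &&
                 handler_sp < (uintptr_t)alt + ALT_STACK_SIZE;
    free(alt);
    return on_alt;
}
```

The caller's own stack only pays for the context switch; everything the handler calls consumes the alternate stack instead, which is the effect being requested for the APIC timer interrupt.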
                                                                     
          
Note:  timer interrupts seem to be handled differently for x86 UP
kernels, so we don't see this problem there.  the IA64 kernel
already has a 32 Kbyte stack, so it doesn't have separate stacks for
interrupts and doesn't need them.  x86_64 uses the thread stack for
timer interrupts but has a larger stack size, so this isn't really an
issue for us on x86_64.

Version-Release number of selected component (if applicable):

kernel-smp-2.6.5-7.109.12.EMP

How reproducible:

run a kernel that instruments stack consumption and look for deep
stacks; typically you'll find 300-odd bytes of stack consumed by
smp_apic_timer_interrupt at the bottom.

we can supply you with a deep stack measuring kernel if you like
(along with the source patches).


Comment 1 craig harmer 2005-03-16 03:45:41 UTC
some additional comments from mark hemment:

For older IA-32 systems, those without a local APIC, the timer
interrupt will use the 'standard' interrupt handler (do_IRQ()) and so
will (as far as I can tell) use a separate stack.  Modern server systems
won't be using this interrupt, so it can be ignored in this discussion.

we don't see an issue here for UP kernels: timer interrupt stacks
don't go as deep there because update_process_times() isn't called.

arch/i386/kernel/apic.c:
        smp_local_timer_interrupt()
        {
                ...
#ifdef  CONFIG_SMP
                /* reached from this path only on SMP kernels */
                update_process_times(user_mode(regs));
#endif
                ...
        }



Comment 2 craig harmer 2005-03-16 03:49:04 UTC
Created attachment 112043
patch to handle timer interrupts on the interrupt stack

here's a patch mark developed to switch to the interrupt stack for handling
APIC timer interrupts.  Note that this has been tested, but not very heavily.

Comment 3 Ingo Molnar 2005-04-13 14:07:27 UTC
The patch looks OK to me in principle, and i've submitted it for inclusion.

Could you also send it to Andrew Morton & Linus? It makes sense and saves 10%
off the worst-case process-stack footprint. We indeed call quite deep into the
scheduler from the APIC timer interrupt, which makes it special (and different
from the other SMP IPI interrupt routines).
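The 10% figure is not derived in this report.  As a rough check only, the 376 bytes from the description can be compared against the 4 Kbyte stack, assuming all 4096 bytes are usable (which ignores any per-thread bookkeeping kept in the same region).  The helper name is hypothetical:

```c
#include <assert.h>

/* Share of the stack consumed, in tenths of a percent,
 * rounded to the nearest unit. */
unsigned int stack_share_permille(unsigned int used_bytes,
                                  unsigned int stack_bytes)
{
    assert(stack_bytes != 0);
    return (used_bytes * 1000u + stack_bytes / 2u) / stack_bytes;
}
```

stack_share_permille(376, 4096) comes to 92, i.e. about 9.2% of the stack, which is in the neighbourhood of the 10% quoted above.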

Comment 4 craig harmer 2005-04-13 19:07:51 UTC
sure.  mark or i will submit it.  thanks!

Comment 5 Marty Wesley 2005-05-26 06:59:40 UTC
PM ACK for U2.

Comment 12 Bryan Mason 2005-09-30 23:00:55 UTC
Hi Craig - A beta kernel which we believe resolves this issue is available on
the Red Hat partners FTP site (partners.redhat.com) and the Red Hat Network
(rhn.redhat.com).  Can you download one of the new kernels and see if it
resolves your problem?  Thanks.


Comment 14 Red Hat Bugzilla 2005-10-05 12:49:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html


