Bug 448574 - [MRG] Hit BUG: MAX_STACK_TRACE_ENTRIES too low! when booting kernel-rt-debug-
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Hardware: x86_64 All
Priority: low, Severity: medium
: 1.0.3
: ---
Assigned To: Red Hat Real Time Maintenance
Reported: 2008-05-27 13:40 EDT by IBM Bug Proxy
Modified: 2008-10-07 15:21 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-10-07 15:21:36 EDT

Attachments
The "dmesg" output showing the BUG() and resulting stack trace (55.61 KB, text/plain)
2008-05-27 13:41 EDT, IBM Bug Proxy
Patch to increase MAX_STACK_TRACE_ENTRIES (722 bytes, patch)
2008-08-19 12:34 EDT, Clark Williams

External Trackers
Tracker ID Priority Status Summary Last Updated
IBM Linux Technology Center 44149 None None None Never

Description IBM Bug Proxy 2008-05-27 13:40:57 EDT
=Comment: #0=================================================
TIMOTHY R. CHAVEZ <chavezt@us.ibm.com> - 2008-04-16 16:12 EDT
Problem description:
During a test-boot of a diskless LS21 using the kernel
with a modified LSI MPP/RDAC driver, I got a "BUG: MAX_STACK_TRACE_ENTRIES too
low!" followed by a stack trace (attached).  With the exception of the RDAC
driver, the kernel is effectively the same kernel as
the standard MRG kernel (no custom patches).  However, it
should be noted that a standard MRG kernel has not been
test-booted, yet.  The machine does not hang and appears to be operational /

If this is not an installation problem,
       Describe any custom patches installed.

No custom patches applied to kernel.  However, a custom LSI/MPP RDAC driver was
built and installed for this kernel.

       Provide output from "uname -a", if possible:

Linux elm3c31 #1 SMP PREEMPT RT Wed Apr 16 00:47:35 EDT
2008 x86_64 x86_64 x86_64 GNU/Linux

Hardware Environment
    Machine type (p650, x235, SF2, etc.): LS21
    Cpu type (Power4, Power5, IA-64, etc.): Dual-Core AMD Opteron(tm) Processor
    Describe any special hardware you think might be relevant to this problem:
Possibly the dual QLogic 4GB HBA cards attached to the machine(?), but I've not
test-booted the debug kernel on any other configuration, so...

Please provide contact information if the submitter is not the primary contact.

Is this reproducible? Yes
    If so, how long does it (did it) take to reproduce it?
    Describe the steps:
Boot the system with the kernel.

Is the system (not just the application) hung? No
    If so, describe how you determined this:

Additional information:

This environment is an LS21 attached to a DS4700 via a couple QLogic 4GB HBA
cards (thus the need for RDAC) and has no local storage.
=Comment: #1=================================================
TIMOTHY R. CHAVEZ <chavezt@us.ibm.com> - 2008-04-16 16:22 EDT

The "dmesg" output showing the BUG() and resulting stack trace

=Comment: #2=================================================
TIMOTHY R. CHAVEZ <chavezt@us.ibm.com> - 2008-04-16 17:29 EDT
I booted the vanilla, trace, and rt kernels on this same system / hardware
configuration without hitting this bug.  I'll attempt to boot the debug kernel
on a system with a local storage configuration tomorrow morning and report my
=Comment: #3=================================================
TIMOTHY R. CHAVEZ <chavezt@us.ibm.com> - 2008-04-22 10:44 EDT
Just a note,

Red Hat is also seeing this in testing.

From Clark Williams @ Red Hat:

This isn't a CONFIG_ option. It's a value defined in lockdep_internals.h and
currently is defined as:


That's pretty big...
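
For readers unfamiliar with the message, here is a minimal sketch of why a fixed-size trace buffer can produce it. This is a toy model, not the kernel's code: the class `TraceStore`, the function `traces_until_full`, and the tiny capacity are hypothetical names and values chosen for illustration. It assumes that saved traces are appended to one shared buffer and never freed, so once the buffer is exhausted the validator can only switch itself off.

```python
# Tiny illustrative limit; NOT the value used by the kernel.
MAX_STACK_TRACE_ENTRIES = 16


class TraceStore:
    """Toy model of a fixed-size, append-only stack trace buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.enabled = True

    def save_trace(self, frames):
        """Record a trace of `frames` entries; return False once disabled."""
        if not self.enabled:
            return False
        if self.used + frames > self.capacity:
            # Assumption: space is never reclaimed, so running out is
            # permanent and the validator is turned off for good.
            self.enabled = False
            return False
        self.used += frames
        return True


def traces_until_full(capacity, frames_per_trace):
    """Count how many equal-sized traces fit before the store gives up."""
    store = TraceStore(capacity)
    saved = 0
    while store.save_trace(frames_per_trace):
        saved += 1
    return saved, store.enabled
```

In this model a larger capacity only delays exhaustion; it does not change the behaviour once the limit is reached.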
Comment 1 IBM Bug Proxy 2008-05-27 13:41:02 EDT
Created attachment 306805
The "dmesg" output showing the BUG() and resulting stack trace
Comment 2 IBM Bug Proxy 2008-06-18 19:08:41 EDT
------- Comment From jstultz@us.ibm.com 2008-06-18 19:02 EDT-------
Has this issue been seen recently?
Comment 3 IBM Bug Proxy 2008-06-30 12:08:45 EDT
------- Comment From chavezt@us.ibm.com 2008-06-30 12:00 EDT-------
I haven't seen it, but then again, I haven't been booting from SAN recently.
Maybe Keith has seen it?  I'm adding him to CC list.
Comment 4 IBM Bug Proxy 2008-08-04 07:00:33 EDT
I have seen this problem on a non-SAN machine while trying to recreate bug
46204. The system took a really long time (45 minutes) to come up. BUG message
seen was:

turning off the locking correctness validator.
Pid: 2112, comm: ip Not tainted #1

Call Trace:
[<ffffffff810146b5>] ? save_stack_trace+0x2a/0x49
[<ffffffff8105d851>] save_trace+0x93/0x9b
[<ffffffff8105d8d7>] add_lock_to_list+0x7e/0xac
[<ffffffff81060eb9>] __lock_acquire+0xb43/0xcdc
[<ffffffff81067443>] ? rt_mutex_slowtrylock+0x18/0x85
[<ffffffff810610e0>] lock_acquire+0x8e/0xb2
[<ffffffff81067443>] ? rt_mutex_slowtrylock+0x18/0x85
[<ffffffff812a7bd2>] __spin_lock_irqsave+0x40/0x73
[<ffffffff81067443>] rt_mutex_slowtrylock+0x18/0x85
[<ffffffff812a5694>] rt_mutex_trylock+0x9/0xb
[<ffffffff812a7105>] rt_spin_lock+0x31/0x56
[<ffffffff8127e194>] ip_mc_inc_group+0x176/0x232
[<ffffffff8127e296>] ip_mc_up+0x46/0x64
[<ffffffff81279947>] inetdev_event+0x263/0x470
[<ffffffff810882d0>] ? __rcu_read_unlock+0x8c/0x95
[<ffffffff812aa943>] notifier_call_chain+0x33/0x5b
[<ffffffff81058001>] __raw_notifier_call_chain+0x9/0xb
[<ffffffff81058012>] raw_notifier_call_chain+0xf/0x11
[<ffffffff812305c2>] call_netdevice_notifiers+0x16/0x18
[<ffffffff81231f2f>] dev_open+0x80/0x88
[<ffffffff8123072f>] dev_change_flags+0xaf/0x16b
[<ffffffff81279ee3>] devinet_ioctl+0x267/0x5f2
[<ffffffff8127a686>] inet_ioctl+0x82/0xa0
[<ffffffff81223dc1>] sock_ioctl+0x1e7/0x20c
[<ffffffff810ce955>] do_ioctl+0x2d/0x83
[<ffffffff810cec20>] vfs_ioctl+0x275/0x292
[<ffffffff812a6a5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff810cec94>] sys_ioctl+0x57/0x7b
[<ffffffff8100c248>] ? system_call+0xb8/0xef
[<ffffffff8100c27f>] system_call_ret+0x0/0x6d

INFO: lockdep is turned off.
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
.. [<ffffffff812a7bb5>] .... __spin_lock_irqsave+0x23/0x73
.....[<ffffffff81067443>] ..   ( <= rt_mutex_slowtrylock+0x18/0x85)
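
When comparing traces like the one above across boots, it can help to reduce them to symbol names. The following is a rough sketch, assuming the `[<address>] symbol+0xoffset/0xsize` line layout seen in this trace; `parse_frame` and `reliable_symbols` are hypothetical helper names. Frames marked with `?` are treated as unreliable, since on x86 that marker denotes an address found on the stack that the unwinder could not confirm as part of the call chain.

```python
import re

# Matches e.g. "[<ffffffff8105d851>] save_trace+0x93/0x9b", with an
# optional "? " marker before the symbol name.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frame(line):
    """Return a dict describing one call-trace line, or None if no match."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group(1), 16),
        "reliable": m.group(2) is None,  # no "?" marker present
        "symbol": m.group(3),
        "offset": int(m.group(4), 16),
        "size": int(m.group(5), 16),
    }


def reliable_symbols(lines):
    """List symbol names of frames that carry no "?" marker, in order."""
    frames = (parse_frame(line) for line in lines)
    return [f["symbol"] for f in frames if f is not None and f["reliable"]]
```

Fed the trace from this comment, the helper would list the path from `system_call_ret` up through `ip_mc_inc_group` and `rt_spin_lock` to `save_trace`, which is where lockdep ran out of entries.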
Comment 5 IBM Bug Proxy 2008-08-19 11:21:34 EDT
While working on bug #46204 (RH459478), Peter Zijlstra suggested trying a few
patches recently committed to Linus' tree to see if they help solve this problem.
They did not. Then he asked me to try higher values of MAX_STACK_TRACE_ENTRIES.
I changed MAX_STACK_TRACE_ENTRIES to 1.25 times its current value (327680)
and I still saw the problem. When I made it 1.5 times the current value
(393216), I did not see the problem. I have reported these results to Peter in
an e-mail as well. He needs to decide whether it is okay to increase this value.
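
The numbers in this comment can be cross-checked with a little arithmetic. The sketch below recovers the unscaled value from the 1.25x figure and confirms that 1.5x gives 393216. The helper names are hypothetical, and the memory estimate assumes one 8-byte `unsigned long` per entry on x86_64, which is an assumption rather than something stated in this report.

```python
from fractions import Fraction

# Assumption: each entry is one unsigned long, i.e. 8 bytes on x86_64.
BYTES_PER_ENTRY = 8


def base_from_scaled(scaled, factor):
    """Recover the original value from a scaled value and its factor."""
    base = Fraction(scaled) / Fraction(factor)
    if base.denominator != 1:
        raise ValueError("scaled value is not an exact multiple of factor")
    return int(base)


def footprint_mib(entries):
    """Approximate static buffer size in MiB for a given entry count."""
    return entries * BYTES_PER_ENTRY / (1024 * 1024)


# 327680 is reported as 1.25 times the current value.
base = base_from_scaled(327680, Fraction(5, 4))

# 1.5 times the current value, as tested successfully.
raised = int(base * Fraction(3, 2))

if __name__ == "__main__":
    print(base, raised, footprint_mib(base), footprint_mib(raised))
```

Under that assumption the change would grow the buffer from about 2 MiB to about 3 MiB, which is the kind of cost to weigh when deciding whether the larger value is acceptable.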
Comment 6 Clark Williams 2008-08-19 12:34:02 EDT
Created attachment 314559

Added patch to increase MAX_STACK_TRACE_ENTRIES by a factor of 1.5 (to 393216) as per Sripathi's tests. This should go into our -78 kernel build.
Comment 8 David Sommerseth 2008-09-26 05:13:42 EDT
Verified that patch (https://bugzilla.redhat.com/attachment.cgi?id=314559) is implemented as mrg-rt.git commit 842bb285febde3ae296de13c8c50da52e56878f7.  Available in mrg-rt-

Bug reproduced using and went away with
Comment 10 errata-xmlrpc 2008-10-07 15:21:36 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

