Bug 618743 - [RHEL6] malloc's error path deadlocks
Status: CLOSED DUPLICATE of bug 676591
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: glibc
Version: 6.1
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Andreas Schwab
QA Contact: qe-baseos-tools-bugs
Duplicates: 664365
Depends On:
Blocks: GSS_6_2_PROPOSED
Reported: 2010-07-27 15:57 UTC by Adam Jackson
Modified: 2018-11-27 21:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 618356
Last Closed: 2011-06-03 13:45:03 UTC
Target Upstream Version:


System      ID     Private  Priority  Status  Summary  Last Updated
Sourceware  11901  0        None      None    None     Never

Description Adam Jackson 2010-07-27 15:57:02 UTC
+++ This bug was initially created as a clone of Bug #618356 +++

Description of problem:
When launching emacs, I periodically get a hang. The system GUI freezes and I can't do anything, but I am still able to ssh into the system.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install RHEL6.0-Snapshot-7-Refresh
2. Log in to the desktop
3. Open a terminal window and run: emacs /tmp/foo &
Actual results:
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x469138]
1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x4a2fe4]
2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x4739d4]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f0c40a09000+0x524f) [0x7f0c40a0e24f]
4: /usr/bin/Xorg (0x400000+0x74897) [0x474897]
5: /usr/bin/Xorg (0x400000+0x10dad3) [0x50dad3]
6: /lib64/libpthread.so.0 (0x31b4800000+0xf4c0) [0x31b480f4c0]
7: /lib64/libc.so.6 (0x31b4400000+0xf0dce) [0x31b44f0dce]
8: /lib64/libc.so.6 (0x31b4400000+0x7c1f8) [0x31b447c1f8]
9: /lib64/libc.so.6 (__libc_malloc+0x62) [0x31b4479af2]
10: /lib64/libc.so.6 (0x31b4400000+0x6fdbb) [0x31b446fdbb]
11: /lib64/libc.so.6 (0x31b4400000+0x75736) [0x31b4475736]
12: /lib64/libc.so.6 (0x31b4400000+0x78e78) [0x31b4478e78]
13: /lib64/libc.so.6 (__libc_malloc+0x6d) [0x31b4479afd]
14: /usr/bin/Xorg (miRegionCreate+0x23) [0x454be3]
15: /usr/bin/Xorg (miRectsToRegion+0x33) [0x455e43]
16: /usr/bin/Xorg (miChangeClip+0x8e) [0x55491e]
17: /usr/lib64/xorg/modules/libexa.so (0x7f0c42287000+0x2c6d) [0x7f0c42289c6d]
18: /usr/bin/Xorg (0x400000+0xd42b4) [0x4d42b4]
19: /usr/bin/Xorg (SetClipRects+0xbf) [0x4368ef]
20: /usr/bin/Xorg (0x400000+0x297a6) [0x4297a6]
21: /usr/bin/Xorg (0x400000+0x2ab5c) [0x42ab5c]
22: /usr/bin/Xorg (0x400000+0x21ffa) [0x421ffa]
23: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x31b441ec5d]
24: /usr/bin/Xorg (0x400000+0x21bb9) [0x421bb9]

Expected results:
Should continue to operate normally

Additional info:

--- Additional comment from jburke@redhat.com on 2010-07-26 14:36:59 EDT ---

nouveau - nVidia Corporation G96 [Quadro FX 580] (rev a1)

--- Additional comment from pm-rhel@redhat.com on 2010-07-26 14:42:38 EDT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from jburke@redhat.com on 2010-07-26 14:42:56 EDT ---

Created an attachment (id=434494)
xorg log

--- Additional comment from pm-rhel@redhat.com on 2010-07-26 14:57:38 EDT ---

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 2 Adam Jackson 2010-07-27 17:05:15 UTC
The backtrace above shows malloc (frame 13) calling __libc_message(abort=1), which calls malloc again (frame 9).  This deadlocks.  That's awful.  Hung processes are worse than crashed processes.
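
To illustrate the re-entrancy problem described here, the following is a minimal sketch rather than glibc's actual code. The names heap_lock, toy_malloc_error_path and toy_report_error are hypothetical, and a try-lock built on a C11 atomic_flag stands in for the allocator's real mutex so that the example returns instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the allocator's lock; glibc's real arena
   mutex is implemented differently. */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock once.  A blocking lock would instead wait here
   until the current holder releases it. */
static bool heap_trylock(void)
{
    return !atomic_flag_test_and_set(&heap_lock);
}

static void heap_unlock(void)
{
    atomic_flag_clear(&heap_lock);
}

/* Models the error path: building the message needs memory, so it wants
   the lock.  Returns true if the lock was available. */
static bool toy_report_error(void)
{
    if (!heap_trylock())
        return false;   /* the holder is our own caller: a blocking lock never returns */
    heap_unlock();
    return true;
}

/* Models malloc finding corrupted bookkeeping while it still holds the lock. */
static bool toy_malloc_error_path(void)
{
    if (!heap_trylock())
        return false;
    bool reported = toy_report_error();   /* re-entry while the lock is held */
    heap_unlock();
    return reported;
}
```

Calling toy_report_error() on its own succeeds, while toy_malloc_error_path() returns false because the inner acquisition finds the lock already held by the same caller. With a blocking, non-recursive lock that inner acquisition would wait forever.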

Either this needs to use some statically allocated bit of memory in .bss, or use sbrk() and just cope with leaking if the app is insane enough to trap SIGABRT and try to carry on.
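
A minimal sketch of the static-buffer option, under the assumption that the message can be assembled from a few string fragments. The buffer name abort_msg_buf, its 256-byte size and both helper functions are hypothetical and are not taken from glibc.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Zero-initialised static storage is placed in .bss.  The size is an
   arbitrary choice for this sketch. */
static char abort_msg_buf[256];

/* Join the fragments in parts[] into abort_msg_buf without touching the
   heap, truncating if they do not fit.  Returns the stored length. */
static size_t build_abort_msg(const char *const parts[], size_t nparts)
{
    size_t used = 0;
    for (size_t i = 0; i < nparts; i++) {
        size_t room = sizeof abort_msg_buf - 1 - used;
        size_t len = strlen(parts[i]);
        if (len > room)
            len = room;
        memcpy(abort_msg_buf + used, parts[i], len);
        used += len;
    }
    abort_msg_buf[used] = '\0';
    return used;
}

/* Emit the message with write(2), which does not allocate. */
static void emit_abort_msg(size_t len)
{
    ssize_t ignored = write(STDERR_FILENO, abort_msg_buf, len);
    (void)ignored;
}
```

Because the buffer has static storage duration, the text stays in the process image after the function returns, at the cost of a fixed amount of data space in every process and silent truncation of long messages.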

Comment 3 Adam Jackson 2010-07-27 17:05:47 UTC
Reassigning back to glibc.  I didn't change component for nothing.

Comment 4 Andreas Schwab 2010-07-28 15:40:27 UTC
This has nothing to do with catching SIGABRT but with having the abort message visible in coredumps.

Comment 6 Adam Jackson 2010-07-29 14:28:29 UTC
(In reply to comment #4)
> This has nothing to do with catching SIGABRT but with having the abort message
> visible in coredumps.    

If the app catches SIGABRT, it may continue instead of exiting.  If it does, and then the abort is raised _again_, we try to free() the old message.  That's what I meant by "use sbrk() and just leak"; we could allocate the storage for the crash message with sbrk(), but we'd have no way of freeing it.
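
The "use sbrk() and just leak" idea can be sketched as follows. This is an illustration only: save_msg_with_sbrk is a hypothetical helper, not a glibc function, and it assumes a Linux/glibc environment where sbrk() is available.

```c
#define _DEFAULT_SOURCE 1   /* exposes sbrk() in glibc headers */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Copy msg into fresh storage obtained by moving the program break.
   The storage is never handed back, so each call leaks strlen(msg) + 1
   bytes.  Returns NULL if the break cannot be moved. */
static char *save_msg_with_sbrk(const char *msg)
{
    size_t len = strlen(msg) + 1;
    void *mem = sbrk((intptr_t)len);
    if (mem == (void *)-1)
        return NULL;
    memcpy(mem, msg, len);
    return mem;
}
```

A second call returns a different address and leaves the first message untouched, which is the leak described here: nothing is ever freed, so the error path never needs malloc's lock.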

We could call into some internal bit of malloc that assumes the lock has already been taken, but that's dangerous: we're already at this point _because_ malloc's bookkeeping is corrupted.

We could use alloca, but then you'd get no record of the crash if the app does catch SIGABRT.

We could use mmap, but then you'd leak maps instead of leaking heap.
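
For comparison, a sketch of the mmap variant. The helper save_msg_with_mmap is hypothetical and assumes a system that supports MAP_ANONYMOUS, as Linux does.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS in glibc headers */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy msg into a private anonymous mapping.  *map_len receives the
   length that was mapped, so a caller that keeps it could munmap() the
   region later.  Returns NULL on failure. */
static char *save_msg_with_mmap(const char *msg, size_t *map_len)
{
    size_t len = strlen(msg) + 1;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, msg, len);
    *map_len = len;
    return mem;
}
```

If the pointer and length are not kept and passed to munmap(), every abort leaves one more mapping behind, which is the map leak mentioned above.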

Or we could use a static buffer in .bss, but then that's additional data space in every process.

But really, at this point in a process's death throes, who cares?  Allocate with sbrk because it's easy.  Anyone trying to survive a SIGABRT is already in a state of sin.

Comment 12 Andreas Schwab 2011-01-10 09:23:13 UTC
*** Bug 664365 has been marked as a duplicate of this bug. ***

Comment 20 Eric Bachalo 2011-02-24 21:57:11 UTC
Moving to RHEL 6.2 release, as no fix is upstream yet.  This will need to be fixed upstream before it is considered for a RHEL release.

Comment 30 Andreas Schwab 2011-06-03 13:45:03 UTC

*** This bug has been marked as a duplicate of bug 676591 ***
