Red Hat Bugzilla – Bug 618743
[RHEL6] malloc's error path deadlocks
Last modified: 2016-11-24 07:40:47 EST
+++ This bug was initially created as a clone of Bug #618356 +++
Description of problem:
When launching emacs, the system periodically hangs. The GUI is unresponsive and I can't do anything locally, but I am still able to ssh into the system.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL6.0-Snapshot-7-Refresh
2. Log in to the desktop
3. Open terminal window, emacs /tmp/foo &
[mi] EQ overflowing. The server is probably stuck in an infinite loop.
0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x469138]
1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x4a2fe4]
2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x4739d4]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f0c40a09000+0x524f) [0x7f0c40a0e24f]
4: /usr/bin/Xorg (0x400000+0x74897) [0x474897]
5: /usr/bin/Xorg (0x400000+0x10dad3) [0x50dad3]
6: /lib64/libpthread.so.0 (0x31b4800000+0xf4c0) [0x31b480f4c0]
7: /lib64/libc.so.6 (0x31b4400000+0xf0dce) [0x31b44f0dce]
8: /lib64/libc.so.6 (0x31b4400000+0x7c1f8) [0x31b447c1f8]
9: /lib64/libc.so.6 (__libc_malloc+0x62) [0x31b4479af2]
10: /lib64/libc.so.6 (0x31b4400000+0x6fdbb) [0x31b446fdbb]
11: /lib64/libc.so.6 (0x31b4400000+0x75736) [0x31b4475736]
12: /lib64/libc.so.6 (0x31b4400000+0x78e78) [0x31b4478e78]
13: /lib64/libc.so.6 (__libc_malloc+0x6d) [0x31b4479afd]
14: /usr/bin/Xorg (miRegionCreate+0x23) [0x454be3]
15: /usr/bin/Xorg (miRectsToRegion+0x33) [0x455e43]
16: /usr/bin/Xorg (miChangeClip+0x8e) [0x55491e]
17: /usr/lib64/xorg/modules/libexa.so (0x7f0c42287000+0x2c6d) [0x7f0c42289c6d]
18: /usr/bin/Xorg (0x400000+0xd42b4) [0x4d42b4]
19: /usr/bin/Xorg (SetClipRects+0xbf) [0x4368ef]
20: /usr/bin/Xorg (0x400000+0x297a6) [0x4297a6]
21: /usr/bin/Xorg (0x400000+0x2ab5c) [0x42ab5c]
22: /usr/bin/Xorg (0x400000+0x21ffa) [0x421ffa]
23: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x31b441ec5d]
24: /usr/bin/Xorg (0x400000+0x21bb9) [0x421bb9]
Expected results: the system should continue to operate normally.
--- Additional comment from firstname.lastname@example.org on 2010-07-26 14:36:59 EDT ---
nouveau - nVidia Corporation G96 [Quadro FX 580] (rev a1)
--- Additional comment from email@example.com on 2010-07-26 14:42:38 EDT ---
Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.
--- Additional comment from firstname.lastname@example.org on 2010-07-26 14:42:56 EDT ---
Created an attachment (id=434494)
--- Additional comment from email@example.com on 2010-07-26 14:57:38 EDT ---
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.
** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
The backtrace above shows malloc calling __libc_message(abort=1), which in turn calls malloc again. The second call waits on the lock the first call still holds, so it deadlocks. That's awful. Hung processes are worse than crashed processes.
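To make the re-entrancy concrete, here is a minimal sketch of the same pattern outside glibc. Everything in it is hypothetical: `arena_lock`, `report_corruption` and `demo_reentrant_lock` are illustration names, and glibc's real allocator lock is not modelled. An error-checking mutex is used so that the second lock attempt returns EDEADLK instead of hanging the way the Xorg process did.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the allocator's lock. */
static pthread_mutex_t arena_lock;

/* Stand-in for the error reporter: it needs the lock that its
   caller is already holding, just like malloc -> error path -> malloc. */
static int report_corruption(void)
{
    int rc = pthread_mutex_lock(&arena_lock);
    if (rc == 0)
        pthread_mutex_unlock(&arena_lock);
    return rc;
}

/* Takes the lock, then enters the error path while still holding it.
   Returns the result of the nested lock attempt. */
int demo_reentrant_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* Error-checking type: relocking from the same thread reports
       EDEADLK. A default mutex would typically just block forever. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&arena_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    int rc = pthread_mutex_lock(&arena_lock);
    assert(rc == 0);
    (void)rc;

    int nested = report_corruption();

    pthread_mutex_unlock(&arena_lock);
    pthread_mutex_destroy(&arena_lock);
    return nested;
}
```

Calling `demo_reentrant_lock()` should return EDEADLK. With a normal, non-checking lock the nested attempt would simply never return, which matches the hang described in this bug.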
Either this needs to use some statically allocated bit of memory in .bss, or use sbrk() and just cope with leaking if the app is insane enough to trap SIGABRT and try to carry on.
Reassigning back to glibc. I didn't change component for nothing.
This has nothing to do with catching SIGABRT but with having the abort message visible in coredumps.
(In reply to comment #4)
> This has nothing to do with catching SIGABRT but with having the abort message
> visible in coredumps.
If the app catches SIGABRT, it may continue instead of exiting. If it does, and then the abort is raised _again_, we try to free() the old message. That's what I meant by "use sbrk() and just leak"; we could allocate the storage for the crash message with sbrk(), but we'd have no way of freeing it.
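A rough sketch of the "use sbrk() and just leak" idea, assuming a Linux/glibc system where sbrk() is available. The function name `leak_message_with_sbrk` is made up for illustration and is not anything in glibc. The point is only that the copy is made without touching malloc's bookkeeping or lock, and that there is deliberately no matching free.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Copy msg into memory obtained straight from sbrk().
   The storage is never released: that is the accepted leak.
   Returns NULL if the program break could not be moved. */
const char *leak_message_with_sbrk(const char *msg)
{
    assert(msg != NULL);

    size_t len = strlen(msg) + 1;          /* include the terminator */
    void *p = sbrk((intptr_t)len);         /* old break = start of new space */
    if (p == (void *)-1)
        return NULL;

    memcpy(p, msg, len);
    return (const char *)p;
}
```

If the process catches SIGABRT and aborts again, each call leaks another few bytes of heap, which is the trade-off argued for in this comment.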
We could call into some internal bit of malloc that assumes the lock has already been taken, but that's dangerous: we're already at this point _because_ malloc's bookkeeping is corrupted.
We could use alloca, but then you'd get no record of the crash if the app does catch SIGABRT.
We could use mmap, but then you'd leak maps instead of leaking heap.
Or we could use a static buffer in .bss, but then that's additional data space in every process.
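For comparison, the static-buffer option could look roughly like the sketch below. The names `abort_msg_buf` and `record_abort_message` and the 256-byte size are assumptions chosen for illustration. vsnprintf is used for brevity only; a real error path would likely want a simpler formatter, since vsnprintf is not guaranteed to avoid allocation for every format.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer. Zero-initialised statics like this
   are typically placed in .bss, so it costs data space in every
   process whether or not the process ever aborts. */
static char abort_msg_buf[256];

/* Format the message into the static buffer. Nothing here asks the
   heap for storage, and there is nothing to free on a second abort:
   the previous message is simply overwritten. Long messages are
   truncated to fit. */
const char *record_abort_message(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap;
    va_start(ap, fmt);
    vsnprintf(abort_msg_buf, sizeof abort_msg_buf, fmt, ap);
    va_end(ap);

    return abort_msg_buf;
}
```

Because the buffer has a fixed address, the message would still be findable in a coredump, which is the requirement raised in comment #4.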
But really, at this point in a process's death throes, who cares? Allocate with sbrk because it's easy. Anyone trying to survive a SIGABRT is already in a state of sin.
*** Bug 664365 has been marked as a duplicate of this bug. ***
Moving to RHEL 6.2 release, as no fix is upstream yet. This will need to be fixed upstream before it is considered for a RHEL release.
*** This bug has been marked as a duplicate of bug 676591 ***