|Summary:||[RHEL6] malloc's error path deadlocks|
|Product:||Red Hat Enterprise Linux 6||Reporter:||Adam Jackson <ajax>|
|Component:||glibc||Assignee:||Andreas Schwab <schwab>|
|Status:||CLOSED DUPLICATE||QA Contact:||qe-baseos-tools|
|Version:||6.1||CC:||cmeadors, dgregor, drepper, ebachalo, fweimer, gholms, jakub, jburke, jkurik, jwest, mgordon, moshiro, myamazak|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2011-06-03 13:45:03 UTC||Type:||---|
Description Adam Jackson 2010-07-27 15:57:02 UTC
+++ This bug was initially created as a clone of Bug #618356 +++

Description of problem:
When launching emacs, I periodically get a "hang" crash. The system GUI is hung and I can't do anything. I am able to ssh into the system.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.7.7-21.el6.x86_64

How reproducible:
Intermittent

Steps to Reproduce:
1. Install RHEL6.0-Snapshot-7-Refresh
2. Log in to the desktop
3. Open a terminal window and run: emacs /tmp/foo &

Actual results:
[mi] EQ overflowing. The server is probably stuck in an infinite loop.

Backtrace:
0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x469138]
1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x4a2fe4]
2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x4739d4]
3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f0c40a09000+0x524f) [0x7f0c40a0e24f]
4: /usr/bin/Xorg (0x400000+0x74897) [0x474897]
5: /usr/bin/Xorg (0x400000+0x10dad3) [0x50dad3]
6: /lib64/libpthread.so.0 (0x31b4800000+0xf4c0) [0x31b480f4c0]
7: /lib64/libc.so.6 (0x31b4400000+0xf0dce) [0x31b44f0dce]
8: /lib64/libc.so.6 (0x31b4400000+0x7c1f8) [0x31b447c1f8]
9: /lib64/libc.so.6 (__libc_malloc+0x62) [0x31b4479af2]
10: /lib64/libc.so.6 (0x31b4400000+0x6fdbb) [0x31b446fdbb]
11: /lib64/libc.so.6 (0x31b4400000+0x75736) [0x31b4475736]
12: /lib64/libc.so.6 (0x31b4400000+0x78e78) [0x31b4478e78]
13: /lib64/libc.so.6 (__libc_malloc+0x6d) [0x31b4479afd]
14: /usr/bin/Xorg (miRegionCreate+0x23) [0x454be3]
15: /usr/bin/Xorg (miRectsToRegion+0x33) [0x455e43]
16: /usr/bin/Xorg (miChangeClip+0x8e) [0x55491e]
17: /usr/lib64/xorg/modules/libexa.so (0x7f0c42287000+0x2c6d) [0x7f0c42289c6d]
18: /usr/bin/Xorg (0x400000+0xd42b4) [0x4d42b4]
19: /usr/bin/Xorg (SetClipRects+0xbf) [0x4368ef]
20: /usr/bin/Xorg (0x400000+0x297a6) [0x4297a6]
21: /usr/bin/Xorg (0x400000+0x2ab5c) [0x42ab5c]
22: /usr/bin/Xorg (0x400000+0x21ffa) [0x421ffa]
23: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x31b441ec5d]
24: /usr/bin/Xorg (0x400000+0x21bb9) [0x421bb9]

Expected results:
Should continue to operate normally

Additional info:

--- Additional comment from firstname.lastname@example.org on 2010-07-26 14:36:59 EDT ---
nouveau - nVidia Corporation G96 [Quadro FX 580] (rev a1)

--- Additional comment from email@example.com on 2010-07-26 14:42:38 EDT ---
Since this issue was entered in bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from firstname.lastname@example.org on 2010-07-26 14:42:56 EDT ---
Created an attachment (id=434494) xorg log

--- Additional comment from email@example.com on 2010-07-26 14:57:38 EDT ---
This issue has been proposed when we are only considering blocker issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current release, ask your support representative to file as a blocker on your behalf. Otherwise ask that it be considered for the next Red Hat Enterprise Linux release. **
Comment 2 Adam Jackson 2010-07-27 17:05:15 UTC
The backtrace above shows malloc calling __libc_message(abort=1) which calls malloc. This deadlocks. That's awful. Hung processes are worse than crashed processes. Either this needs to use some statically allocated bit of memory in .bss, or use sbrk() and just cope with leaking if the app is insane enough to trap SIGABRT and try to carry on.
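For illustration, the "statically allocated bit of memory in .bss" option could look roughly like the sketch below. This is not glibc's code: the names `abort_msg_buf`, `save_abort_msg`, `report_fatal` and the 256-byte size are all hypothetical, chosen only to show an error path that records and prints a message without ever entering malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical size; assumed large enough for a typical error line. */
#define ABORT_MSG_MAX 256

static_assert(ABORT_MSG_MAX >= 64, "abort message buffer is too small to be useful");

/* Lives in .bss: zero-initialised, no heap involvement, and part of the
 * process image, so it shows up in a core dump. */
static char abort_msg_buf[ABORT_MSG_MAX];

/* Copy msg into the static buffer, truncating if it does not fit.
 * Uses only strlen/memcpy, so it never takes the malloc lock. */
const char *save_abort_msg(const char *msg)
{
    size_t len = strlen(msg);
    if (len >= sizeof abort_msg_buf)
        len = sizeof abort_msg_buf - 1;
    memcpy(abort_msg_buf, msg, len);
    abort_msg_buf[len] = '\0';
    return abort_msg_buf;
}

/* Record the message and print it with write(2), bypassing stdio
 * (whose buffers may themselves be heap-allocated). A real error path
 * would call abort() after this returns. */
void report_fatal(const char *msg)
{
    const char *saved = save_abort_msg(msg);
    ssize_t ignored = write(STDERR_FILENO, saved, strlen(saved));
    (void)ignored; /* nothing sensible to do if stderr is unwritable */
}
```

The trade-off in this sketch is a fixed per-process cost (the buffer) and truncation of long messages, in exchange for an error path that cannot re-enter the allocator.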
Comment 3 Adam Jackson 2010-07-27 17:05:47 UTC
Reassigning back to glibc. I didn't change component for nothing.
Comment 4 Andreas Schwab 2010-07-28 15:40:27 UTC
This has nothing to do with catching SIGABRT but with having the abort message visible in coredumps.
Comment 6 Adam Jackson 2010-07-29 14:28:29 UTC
(In reply to comment #4)
> This has nothing to do with catching SIGABRT but with having the abort message
> visible in coredumps.

If the app catches SIGABRT, it may continue instead of exiting. If it does, and then the abort is raised _again_, we try to free() the old message. That's what I meant by "use sbrk() and just leak"; we could allocate the storage for the crash message with sbrk(), but we'd have no way of freeing it.

We could call into some internal bit of malloc that assumes the lock has already been taken, but that's dangerous, we're already at this point _because_ malloc's bookkeeping is corrupted. We could use alloca, but then you'd get no record of the crash if the app does catch SIGABRT. We could use mmap, but then you'd leak maps instead of leaking heap. Or we could use a static buffer in .bss, but then that's additional data space in every process.

But really, at this point in a process' death throes, who cares. Allocate with sbrk because it's easy. Anyone trying to survive from a SIGABRT is already in a state of sin.
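A minimal sketch of the "allocate with sbrk() and just leak" idea follows. It is an illustration under assumptions, not a proposed glibc patch: `leak_abort_msg` is a hypothetical helper name, and the sketch assumes a Linux/glibc system where sbrk() can still move the program break.

```c
#define _DEFAULT_SOURCE /* expose sbrk() under strict -std= modes on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* The length is passed to sbrk() as an intptr_t increment. */
static_assert(sizeof(intptr_t) >= sizeof(size_t),
              "length must fit sbrk()'s increment type");

/* Hypothetical helper: copy msg into fresh storage taken directly from
 * the program break. It never touches malloc's data structures or lock.
 * The storage is never given back, so each call leaks strlen(msg) + 1
 * bytes; that is the accepted cost if the app survives SIGABRT. */
char *leak_abort_msg(const char *msg)
{
    size_t len = strlen(msg) + 1;
    void *p = sbrk((intptr_t)len);
    if (p == (void *)-1)
        return NULL; /* break could not be moved; no message is saved */
    memcpy(p, msg, len);
    return p;
}
```

If the abort is raised a second time, a caller using this helper would simply call it again and overwrite its pointer to the old message rather than calling free() on it.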
Comment 12 Andreas Schwab 2011-01-10 09:23:13 UTC
*** Bug 664365 has been marked as a duplicate of this bug. ***
Comment 20 Eric Bachalo 2011-02-24 21:57:11 UTC
Moving to RHEL 6.2 release, as no fix is upstream yet. This will need to be fixed upstream before it is considered for a RHEL release.