Description of problem:

libpthread.so: pthread_mutex_lock call hangs

rpm -qif /lib/libpthread.so.0
Name        : glibc                          Relocations: (not relocateable)
Version     : 2.2.4                               Vendor: Red Hat, Inc.
Release     : 29.1                            Build Date: Wed 07 Aug 2002 08:19:59 AM EDT
Install date: Thu 10 Oct 2002 06:55:31 PM EDT Build Host: stripples.devel.redhat.com
Group       : System Environment/Libraries    Source RPM: glibc-2.2.4-29.1.src.rpm
Size        : 18113277                           License: LGPL
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary     : The GNU libc libraries.

The program hangs on lock. When I attached to the process through gdb:

Program received signal SIGINT, Interrupt.
0x40ee5cb5 in sigsuspend () from /lib/libc.so.6
(gdb) where
#0  0x40ee5cb5 in sigsuspend () from /lib/libc.so.6
#1  0x40e28c19 in pthread_kill_other_threads_np () from /lib/libpthread.so.0
#2  0x40e2aec9 in sem_destroy () from /lib/libpthread.so.0
#3  0x40e26dd6 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x0809e53f in Synchronizable::lock (this=0x829b45c) at S/synchronizable.cpp:50
#5  0x413a7d07 in QuoteAgentViewData::setSSReply (this=0x829b458, ssreply=0x0) at S/quoteAgentViewData.h:58
#6  0x41273195 in QuoteAgent::synchronizeSSReplyObject (this=0x81d7540, subject=0xbffebb70, client=0xbffebb90, ss_id=-1) at S/quoteAgent.cpp:2169
#7  0x41271d34 in QuoteAgent::unsubscribe (this=0x81d7540, clients=@0xbffec1e0, subjects=@0xbffec1b0) at S/quoteAgent.cpp:1974
#8  0x0807cf49 in Rv6ServerProxy::batchunsubscribe (this=0x41131688, msg=@0xbffec330) at S/rv6_serverproxy.cpp:929
#9  0x080780de in Rv6ServerProxy::processMsg (this=0x41131688, msg=@0xbffec330) at S/rv6_serverproxy.cpp:263
#10 0x08077b4e in Rv6ServerProxy::onMsg (this=0x41131688, listener=0x827bbb0, msg=@0xbffec330) at S/rv6_serverproxy.cpp:184
#11 0x080b8217 in TibrvMsgCallback::onEvent () at S/semaphore.cpp:95
#12 0x080b7bd4 in TibrvEvent::_listenCB () at S/semaphore.cpp:95
#13 0x40053420 in _tibrvQueue_Dispatch () from /local/rv72/lnx86-24//lib/libtibrv.so
#14 0x4005358c in tibrvQueue_TimedDispatch () from /local/rv72/lnx86-24//lib/libtibrv.so
#15 0x080b8b3d in TibrvQueue::dispatch () at S/semaphore.cpp:95
#16 0x08091223 in main (argc=7, argv=0xbffec544) at S/main.cpp:72
#17 0x40ed3757 in __libc_start_main () from /lib/libc.so.6

The mutex is created with the default (fast) type. I checked for possible deadlocks, etc., and found none to my knowledge. The same code normally works fine. Only when it's run continuously for a couple of days does it end up hanging like this.

Questions:

1. When will pthread_mutex_lock end up calling sem_destroy? What are the conditions for this to occur?

2. As you can see, sem_destroy ends up calling pthread_kill_other_threads_np, which tries to kill all the threads in the process. I've only seen pthread_kill_other_threads_np called when the process runs out of resources/descriptors or something. Why is it being called here?

3. Any pointers on known issues with pthread_mutex_lock, or on when pthread_mutex_lock produces the above stack trace, will be helpful.

Thanks,
arun

Version-Release number of selected component (if applicable):
glibc 2.2.4

How reproducible:
Happens whenever we run our program for a day or two.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
1) pthread_mutex_lock never calls sem_destroy; the backtrace can't be trusted.
2) sem_destroy doesn't call pthread_kill_other_threads_np either.

If you manage to create a small self-contained testcase which points to a glibc bug (as opposed to broken application locking, which is much more likely), please reopen this bug. Without it there is really nothing we can do for you.
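
One way to test for broken application locking is to swap the default (fast) mutex for an error-checking one while debugging. With a fast mutex, a thread that locks a mutex it already holds simply blocks forever; with PTHREAD_MUTEX_ERRORCHECK the second lock attempt returns EDEADLK instead. The sketch below is only an illustration of that technique, not code from the reported application: the helper name relock_same_thread is hypothetical, and it assumes the hang comes from a same-thread relock, which the report does not confirm.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not from the report): locks an error-checking mutex
// twice from the same thread and returns the result of the second attempt.
int relock_same_thread() {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);

    pthread_mutex_t mutex;
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    int first = pthread_mutex_lock(&mutex);   // succeeds, thread now owns it
    int second = pthread_mutex_lock(&mutex);  // reports an error, no hang

    if (first == 0) {
        pthread_mutex_unlock(&mutex);
    }
    pthread_mutex_destroy(&mutex);
    return second;
}
```

If a wrapper such as Synchronizable::lock checked the return value of pthread_mutex_lock and logged anything other than 0, a relock by the owning thread would show up as an error message rather than as a hang after days of running. It would not catch a deadlock between two different threads.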