Bug 132553

Summary: All threads are waiting for mutex libc_free, none of them are obtaining it
Product: Red Hat Enterprise Linux 2.1
Reporter: Murat Berk <murat_berk>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 2.1
CC: drepper, fweimer
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-08-18 09:31:51 UTC

Description Murat Berk 2004-09-14 18:34:44 UTC
Description of problem:
We have a multithreaded server which is a heavy user of malloc/free.
Usually we do not have any problems for a long time. But at certain
times, most of our threads go into a state where they are waiting for a
mutex that none of them holds. I cannot reproduce it at will, but
we can make it happen, sometimes in a couple of hours, sometimes in a day
or two. It seems to happen mostly when we make calls in many
threads against the NSS subsystem, i.e. when we verify
passwords for many clients and run through group entries to find the
groups each user belongs to.

We can capture this in a debugger, and I collected as much information
as I could. If you need more, we need to wait for it to happen again so I
can get some more. All the details are in the Additional info section.


Version-Release number of selected component (if applicable):
glibc-2.2.4-32.17 (latest updates on RH AS 2.1 using up2date)

How reproducible:
Every time

Steps to Reproduce:
1. Run the application for a while, then force all clients to reconnect
   at the same time
  
Actual results:
Application deadlocks and cannot do anything else after a while

Expected results:


Additional info:

Here is the full stack trace for all threads, the contents of *mutex, and
as much of the next-waiting node list as I could capture.

It looks like 5 threads are contending for the malloc/free lock.
Four of them (threads 3, 4, 5, and 7) are on the wait node list.
Only thread 1 is not there, and it was the same the last time it
deadlocked. The one that is not waking up is the one that
is doing the NSS calls...


RH AS 2.1 SMP kernel with all fixes applied
   
[root@houperflx117 linux-2-4-x86]# ps -ef | grep cserver
root     27814     1  0 06:40 pts/6    00:00:22 ./cserver <= main thread
patqa1   27816 27814  0 06:40 pts/6    00:00:00 ./cserver <= manager thread
patqa1   27817 27816  0 06:40 pts/6    00:00:03 ./cserver
patqa1   27821 27816  0 06:41 pts/6    00:00:12 ./cserver
patqa1   27822 27816  0 06:41 pts/6    00:00:00 ./cserver
patqa1   27823 27816  0 06:41 pts/6    00:00:00 ./cserver
patqa1   27824 27816  0 06:41 pts/6    00:00:00 ./cserver
patqa1   27826 27816  0 06:41 pts/6    00:00:00 ./cserver
patqa1   27827 27816  0 06:41 pts/6    00:00:00 ./cserver
root     28199 28128  0 08:07 pts/7    00:00:00 grep cserver

The main thread runs as root because we switch to root before we can
read shadow password entries etc...

(gdb) p/x *mutex
$6 = {__m_reserved = 0x0, __m_count = 0x0, __m_owner = 0x0, __m_kind = 0x0, __m_lock = {__status = 0x4092157c, __spinlock = 0x0}}

(gdb) x/10x 0x4092157c <== from __status in *mutex
0x4092157c:     0x4197e6bc      0x40921be0      0x00000000      0x40063e2e
0x4092158c:     0x4092164c      0x00000002      0x409215b4      0x400699b8
0x4092159c:     0x41d79b28      0x41d00420
(wait node belongs to thread 3)

(gdb) x/10x 0x4197e6bc <== from _next_node on thread 3 wait node
0x4197e6bc:     0x41a7e05c      0x4197ebe0      0x00000000      0x40060dac
0x4197e6cc:     0x0909efc0      0x401e42b4      0x4197e704      0x400699b8
0x4197e6dc:     0x42e766d8      0x41d00420
(wait node belongs to thread 4)

(gdb) x/10x 0x41a7e05c <== from _next_node on thread 4 wait node
0x41a7e05c:     0x41c81000      0x41a7fbe0      0x00000000      0x41a7fbe0
0x41a7e06c:     0x41e00010      0x434bfa30      0x41a7e094      0x400699b8
0x41a7e07c:     0x437e54d8      0x41d00420
(wait node belongs to thread 5)

(gdb) x/10x 0x41c81000 <== from _next_node on thread 5 wait node
0x41c81000:     0x00000001      0x41c81be0      0x00000000      0x40060e04
0x41c81010:     0x401e2ba0      0x401e42b4      0x41c81048      0x400699b8
0x41c81020:     0x437980a0      0x41d00420
(wait node belongs to thread 7)

Why is the next node pointer 0x00000001 rather than NULL?
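
One possible explanation, offered as an assumption and not confirmed
anywhere in this report: in the LinuxThreads lock used here, the status
word appears to be 0 when the lock is free, 1 when it is held with no
waiters, and otherwise the address of the newest wait node, whose first
word stores the status value it replaced. If so, 0x00000001 is simply
the end-of-list marker left by the first waiter. The C sketch below
models that reading; the struct layout and all names in it are
illustrative and are not copied from the glibc sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative model of a wait node (assumed layout, hypothetical names). */
struct wait_node_model {
    uintptr_t next;       /* status word this node replaced: 0, 1, or older node */
    void     *thr;        /* descriptor of the waiting thread */
    int       abandoned;  /* nonzero if the waiter gave up */
};

/* Count the waiters reachable from a lock status word.
   Both 0 (free) and 1 (held, no waiters) end the walk. */
static size_t count_waiters(uintptr_t status)
{
    size_t n = 0;
    while (status != 0 && status != 1) {
        const struct wait_node_model *node =
            (const struct wait_node_model *)status;
        n++;
        status = node->next;
    }
    return n;
}

/* Rebuild the chain seen in the dumps: thread 3 -> 4 -> 5 -> 7 -> 1.
   The thr values are the descriptor addresses from the dumps and are
   never dereferenced here. */
static size_t demo_chain_from_report(void)
{
    struct wait_node_model t7 = { 1, (void *)(uintptr_t)0x41c81be0u, 0 };
    struct wait_node_model t5 = { (uintptr_t)&t7, (void *)(uintptr_t)0x41a7fbe0u, 0 };
    struct wait_node_model t4 = { (uintptr_t)&t5, (void *)(uintptr_t)0x4197ebe0u, 0 };
    struct wait_node_model t3 = { (uintptr_t)&t4, (void *)(uintptr_t)0x40921be0u, 0 };
    return count_waiters((uintptr_t)&t3);
}
```

Under this reading the dumps describe four queued waiters (threads 3,
4, 5 and 7), which matches the observation that thread 1 is not on the
list.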

(gdb) thread apply all bt

Thread 9 (Thread 57352 (LWP 27827)):
#0  0x400dbbe5 in __sigsuspend (set=0x45600844) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x45600be0) 
at pthread.c:1027
#2  0x4005ebdc in pthread_cond_wait (cond=0x972f8e8, mutex=0x972f8b8) 
at restart.h:34
#3  0x4074006f in TSS_ConditionVariable::wait () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#4  0x405cd187 in GenTQRep::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#5  0x405ce4e5 in OSS_Gen_ThreadQ::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#6  0x403ee434 in Cos_RootObject::runMainLoop () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#7  0x417bcb7d in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#8  0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#9  0x4005fc6f in pthread_start_thread (arg=0x45600be0) at 
manager.c:279
#10 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 8 (Thread 49159 (LWP 27826)):
#0  0x400dbbe5 in __sigsuspend (set=0x44339814) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x44339be0) 
at pthread.c:1027
#2  0x4005ebdc in pthread_cond_wait (cond=0x966d1b8, mutex=0x966d188) 
at restart.h:34
#3  0x4074006f in TSS_ConditionVariable::wait () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#4  0x405cd187 in GenTQRep::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#5  0x405ce4e5 in OSS_Gen_ThreadQ::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#6  0x403ee434 in Cos_RootObject::runMainLoop () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#7  0x40decc98 in PslTask_Task::setVmContext () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/libpsltask_t.so.7.3
#8  0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#9  0x4005fc6f in pthread_start_thread (arg=0x44339be0) at 
manager.c:279
#10 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 7 (Thread 40966 (LWP 27824)):
#0  0x400dbbe5 in __sigsuspend (set=0x41c80f20) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x41c81be0) 
at pthread.c:1027
#2  0x40064029 in __pthread_alt_lock (lock=0x41d00430, self=0x0) at 
restart.h:34
#3  0x40060d36 in __pthread_mutex_lock (mutex=0x41d00420) at 
mutex.c:120
#4  0x4012e418 in __libc_free (mem=0x437980a8) at malloc.c:3153
#5  0x407025d7 in __builtin_delete () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libmetakit2_t.so.7
#6  0x407235ca in bmc_deallocate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#7  0x4072a3c0 in bmc_string16::_reserve () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#8  0x4072a402 in bmc_string16::reserve () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#9  0x4072ae2c in bmc_string16::replace () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#10 0x4072adb7 in bmc_string16::replace () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#11 0x40586706 in _CatalogInfo::findOrOpenCatalog () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#12 0x4064144c in OSS_FindCatCB::callbackHandler () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#13 0x405be68b in OSS_SearchPath::scan () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#14 0x405882ae in OSS_Catalog::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#15 0x4058b519 in OSS_Catalog::getMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#16 0x416cdfff in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#17 0x416cb75a in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#18 0x416b9f5a in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#19 0x416b9701 in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#20 0x416f6c5b in LTBackendProc () 
from /data1/BMC/ds_client/Patrol7/bin/linux-2-4-x86/liblayout_t.so.7.3
#21 0x4042cb46 in Cos_StandardObject::executeAsync () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#22 0x403f2ecc in Cos_RootObject::executeAsync () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#23 0x403fad64 in _Cos_ServicesObject::_handleRPCRequestMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#24 0x403ffb05 in _Cos_ServicesObject::_decodeCosMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#25 0x403c6155 in Cos_IPCMessage::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#26 0x405ce587 in OSS_Gen_ThreadQ::dispatchMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#27 0x403a5e8d in _Cos_ThreadPoolMember::threadProc () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#28 0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#29 0x4005fc6f in pthread_start_thread (arg=0x41c81be0) at 
manager.c:279
#30 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 6 (Thread 32773 (LWP 27823)):
#0  0x400dbbe5 in __sigsuspend (set=0x41b7f13c) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x41b80be0) 
at pthread.c:1027
#2  0x40065554 in __pthread_rwlock_wrlock (rwlock=0x80eb048) at 
restart.h:34
#3  0x407409f8 in TSS_RwMutex::writeLock () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#4  0x40581b4e in OSS_AccountCache::add () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#5  0x40230aad in Agent_AuthenticationScheme::appendCookieFields ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#6  0x4025a408 in Agent_MLMChallenger::checkPasswordAsyn ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#7  0x402c6613 in Auth_PasswordChallenger::handleResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#8  0x402c9843 in Auth_Scheme::dispatchResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#9  0x402c9546 in Auth_Scheme::handleResponseAsync () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#10 0x403b1786 in Cos_AuthServer::authenticate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#11 0x403f9640 in _Cos_ServicesObject::_handleRPCRequestMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#12 0x403ffb05 in _Cos_ServicesObject::_decodeCosMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#13 0x403c6155 in Cos_IPCMessage::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#14 0x405ce587 in OSS_Gen_ThreadQ::dispatchMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#15 0x403a5e8d in _Cos_ThreadPoolMember::threadProc () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#16 0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#17 0x4005fc6f in pthread_start_thread (arg=0x41b80be0) at 
manager.c:279
#18 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 5 (Thread 24580 (LWP 27822)):
#0  0x400dbbe5 in __sigsuspend (set=0x41a7df7c) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x41a7fbe0) 
at pthread.c:1027
#2  0x40064029 in __pthread_alt_lock (lock=0x41d00430, self=0x0) at 
restart.h:34
#3  0x40060d36 in __pthread_mutex_lock (mutex=0x41d00420) at 
mutex.c:120
#4  0x4012e418 in __libc_free (mem=0x437e54e0) at malloc.c:3153
#5  0x407025d7 in __builtin_delete () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libmetakit2_t.so.7
#6  0x407235ca in bmc_deallocate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#7  0x4072a103 in bmc_string16::~bmc_string16 () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#8  0x406ba16f in I18n_String::~I18n_String () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libi18n_t.so.7
#9  0x40582660 in OSS_AccountRep::~OSS_AccountRep () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#10 0x40640331 in void oss_unref<OSS_AccountRep> () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#11 0x405805b1 in OSS_Account::~OSS_Account () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#12 0x40581be1 in OSS_AccountCache::add () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#13 0x40230aad in Agent_AuthenticationScheme::appendCookieFields ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#14 0x4025a408 in Agent_MLMChallenger::checkPasswordAsyn ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#15 0x402c6613 in Auth_PasswordChallenger::handleResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#16 0x402c9843 in Auth_Scheme::dispatchResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#17 0x402c9546 in Auth_Scheme::handleResponseAsync () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#18 0x403b1786 in Cos_AuthServer::authenticate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#19 0x403f9640 in _Cos_ServicesObject::_handleRPCRequestMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#20 0x403ffb05 in _Cos_ServicesObject::_decodeCosMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#21 0x403c6155 in Cos_IPCMessage::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#22 0x405ce587 in OSS_Gen_ThreadQ::dispatchMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#23 0x403a5e8d in _Cos_ThreadPoolMember::threadProc () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#24 0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#25 0x4005fc6f in pthread_start_thread (arg=0x41a7fbe0) at 
manager.c:279
#26 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 4 (Thread 16387 (LWP 27821)):
#0  0x400dbbe5 in __sigsuspend (set=0x4197e5dc) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x4197ebe0) 
at pthread.c:1027
#2  0x40064029 in __pthread_alt_lock (lock=0x41d00430, self=0x0) at 
restart.h:34
#3  0x40060d36 in __pthread_mutex_lock (mutex=0x41d00420) at 
mutex.c:120
#4  0x4012e418 in __libc_free (mem=0x42e766e0) at malloc.c:3153
#5  0x407025d7 in __builtin_delete () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libmetakit2_t.so.7
#6  0x407235ca in bmc_deallocate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#7  0x4072a103 in bmc_string16::~bmc_string16 () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#8  0x40439a0b in Cos_Value::_destructStringValue () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#9  0x4043a03e in Cos_Value::release () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#10 0x403a8909 in bmc_array<Cos_Attribute>::_unref () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#11 0x403a8b81 in bmc_array<Cos_Attribute>::~bmc_array () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#12 0x403a9db0 in Cos_AttributeSet::~Cos_AttributeSet () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#13 0x40446875 in Cos_Event::~Cos_Event () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#14 0x40434a4b in Cos_SubscriptionTable::_flushImpl () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#15 0x403c22da in Cos_EventFlushToken::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#16 0x405ce587 in OSS_Gen_ThreadQ::dispatchMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#17 0x403a5e8d in _Cos_ThreadPoolMember::threadProc () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#18 0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#19 0x4005fc6f in pthread_start_thread (arg=0x4197ebe0) at 
manager.c:279
#20 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 3 (Thread 8194 (LWP 27817)):
#0  0x400dbbe5 in __sigsuspend (set=0x4092149c) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x40921be0) 
at pthread.c:1027
#2  0x40064029 in __pthread_alt_lock (lock=0x41d00430, self=0x0) at 
restart.h:34
#3  0x40060d36 in __pthread_mutex_lock (mutex=0x41d00420) at 
mutex.c:120
#4  0x4012e418 in __libc_free (mem=0x41d79b30) at malloc.c:3153
#5  0x407025d7 in __builtin_delete () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libmetakit2_t.so.7
#6  0x407235ca in bmc_deallocate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#7  0x4045986e in bmc_hash_table<int, Cos_Callback *>::_putBucket () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#8  0x4045989e in bmc_hash_table<int, Cos_Callback *>::_deleteBucket () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#9  0x40723a77 in bmc_hash_table_base::_erase () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libbmc_t.so.7
#10 0x40459db5 in bmc_hash_table<int, Cos_Callback *>::erase () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#11 0x4040e29e in _Cos_ServicesObject::clearTimedOutCallbacks () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#12 0x403c1c34 in Cos_EventFlusher::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#13 0x405c903a in OSS_TaskManager::_dispatchTask () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#14 0x405cb213 in OSS_TimerTaskManager::_doTasks () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#15 0x405c8fd5 in OSS_TaskManager::doTasks () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#16 0x405c9f8d in OSS_TaskMonitor::run () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#17 0x405ca37d in OSS_TaskMonitor::run () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#18 0x40396044 in Cos_CommThread::threadProc () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#19 0x407403c5 in tss_isThreadEqual () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libtss_t.so.7
#20 0x4005fc6f in pthread_start_thread (arg=0x40921be0) at 
manager.c:279
#21 0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 2 (Thread 16385 (LWP 27816)):
#0  0x40193487 in __poll (fds=0x80ee034, nfds=1, timeout=2000) 
at ../sysdeps/unix/sysv/linux/poll.c:63
#1  0x4005f920 in __pthread_manager (arg=0x8) at manager.c:135
#2  0x4019acea in thread_start () from /lib/i686/libc.so.6

Thread 1 (Thread 8192 (LWP 27814)):
#0  0x400dbbe5 in __sigsuspend (set=0xbffe8550) 
at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400622b9 in __pthread_wait_for_restart_signal (self=0x40067020) 
at pthread.c:1027
#2  0x40064029 in __pthread_alt_lock (lock=0x41d00430, self=0x0) at 
restart.h:34
#3  0x40060d36 in __pthread_mutex_lock (mutex=0x41d00420) at 
mutex.c:120
#4  0x4012e418 in __libc_free (mem=0x42aa1980) at malloc.c:3153
#5  0x401b7686 in clntudp_destroy (cl=0x42aa1980) at clnt_udp.c:606
#6  0x40ad52d4 in do_ypcall (domain=0x40ae3400 "bmc.com", prog=3, 
xargs=0x40ad4500 <xdr_ypreq_key>, req=0xbffe8770 "",
    xres=0x40ad45d0 <xdr_ypresp_val>, resp=0xbffe8760 "ÿÿÿÿ") at 
ypclnt.c:217
#7  0x40ad550e in yp_match (indomain=0x40ae3400 "bmc.com", 
inmap=0x40aef904 "shadow.byname", inkey=0x437ae878 "patqa1",
    inkeylen=6, outval=0xbffe87d0, outvallen=0xbffe87d4) at 
ypclnt.c:430
#8  0x40aedbd0 in _nss_nis_getspnam_r (name=0x437ae878 "patqa1", 
sp=0xbffe9858,
    buffer=0xbffe8858 "/PAT0!s\tAGENT_HOUPERFSUN,\210_¿", 
buflen=4096, errnop=0x401e4f20) at nss_nis/nis-spwd.c:179
#9  0x4019f650 in __old_getspnam_r (name=0x437ae878 "patqa1", 
resbuf=0xbffe9858,
    buffer=0xbffe8858 "/PAT0!s\tAGENT_HOUPERFSUN,\210_¿", 
buflen=4096, result=0xbffe8854) at ../nss/getXXbyYY_r.c:200
#10 0x40ab67d2 in BAA_ImportLoginUser () 
from /data1/BMC/ds_client/common/security/bin/linux-2-4-x86/bmcauth.so
#11 0x405d4791 in BAA_LoginUser () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#12 0x405d3232 in OSS_BaaAuthObject::checkPassword () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#13 0x405d2d99 in OSS_BaaAuthObject::validateUser () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#14 0x4058520d in OSS_AuthObject::validateUser () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#15 0x402309ef in Agent_AuthenticationScheme::appendCookieFields ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#16 0x4025a408 in Agent_MLMChallenger::checkPasswordAsyn ()
   from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#17 0x402c6613 in Auth_PasswordChallenger::handleResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#18 0x402c9843 in Auth_Scheme::dispatchResponseAsyn () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#19 0x402c9546 in Auth_Scheme::handleResponseAsync () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libauth_t.so.7
#20 0x403b1786 in Cos_AuthServer::authenticate () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#21 0x403f9640 in _Cos_ServicesObject::_handleRPCRequestMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#22 0x403ffb05 in _Cos_ServicesObject::_decodeCosMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#23 0x403c6155 in Cos_IPCMessage::execute () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#24 0x405ce587 in OSS_Gen_ThreadQ::dispatchMessage () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/liboss_t.so.7
#25 0x403ee421 in Cos_RootObject::runMainLoop () 
from /opt/bmc/common/bmc/bin/linux-2-2-x86/libcos_t.so.7
#26 0x40225438 in Agent::agentRun () 
from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#27 0x40257592 in main_entry () 
from /data1/BMC/ds_client/common/bmc/bin/linux-2-4-x86/libagentlib_t.so.7.3
#28 0x0804f0e4 in ?? ()
#29 0x0804f1e3 in ?? ()
#30 0x0804d4d1 in ?? ()
#31 0x400c9687 in __libc_start_main (main=0x804d498, argc=1, 
ubp_av=0xbffec514, init=0x804c8bc, fini=0x80be9f0,
    rtld_fini=0x4000dda4 <_dl_fini>, stack_end=0xbffec50c) 
at ../sysdeps/generic/libc-start.c:129
#32 0x0804d3f1 in ?? ()

Comment 1 Ulrich Drepper 2005-07-25 22:39:01 UTC
There really isn't any information in the report which can help solve the
issue.  And I don't know of any other reports like this.

My guess is that there was earlier memory corruption of some sort, or that a
malloc function has been used incorrectly (called from a signal handler, double
free, ...).

Did you run your program code under a memory handling debugger?  Either one of
the simple-minded ones like dmalloc and EFence, which come with RHEL, or
valgrind on RHEL4?  Also, can this be reproduced on RHEL3 or RHEL4?

Comment 2 Jakub Jelinek 2005-08-18 09:31:51 UTC
Without a self-contained testcase I don't think we can move forward with this.
If you create a reproducer, please reopen with the attachment.
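
A starting point for such a testcase might look like the sketch below.
It is hypothetical and is not known to trigger the hang: several
threads churn malloc/free while one thread performs NSS lookups,
roughly mirroring the workload in the description. The names
churn_heap, lookup_users and run_stress, the thread and iteration
counts, and the use of getpwnam_r as a stand-in for the getspnam_r call
seen in thread 1 are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>

#define CHURN_THREADS 4      /* assumed number of allocator threads */
#define ITERATIONS    10000  /* assumed malloc/free cycles per thread */
#define LOOKUPS       100    /* assumed NSS lookups in the lookup thread */

/* Allocate and free blocks of varying size; counts completed cycles. */
static void *churn_heap(void *arg)
{
    long *completed = arg;
    for (int i = 0; i < ITERATIONS; i++) {
        size_t size = 16 + (size_t)(i % 64) * 8;
        char *block = malloc(size);
        if (block == NULL)
            break;
        memset(block, 0, size);
        free(block);
        (*completed)++;
    }
    return NULL;
}

/* Resolve a user through NSS repeatedly; the lookup result is ignored. */
static void *lookup_users(void *arg)
{
    long *completed = arg;
    char buf[4096];
    struct passwd pw, *result;
    for (int i = 0; i < LOOKUPS; i++) {
        getpwnam_r("root", &pw, buf, sizeof buf, &result);
        (*completed)++;
    }
    return NULL;
}

/* Run all threads to completion. Returns the total number of completed
   iterations, or -1 if a thread could not be started. A hang would show
   up as pthread_join never returning. */
static long run_stress(void)
{
    pthread_t threads[CHURN_THREADS + 1];
    long counts[CHURN_THREADS + 1] = {0};
    int started = 0;
    long total = 0;

    for (int i = 0; i <= CHURN_THREADS; i++) {
        void *(*fn)(void *) = (i < CHURN_THREADS) ? churn_heap : lookup_users;
        if (pthread_create(&threads[i], NULL, fn, &counts[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += counts[i];
    }
    return (started == CHURN_THREADS + 1) ? total : -1;
}
```

To approximate the reported setup more closely, the lookup would need
to go through NIS for shadow entries, which requires root and a
configured NIS domain.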

Comment 3 Ravi Prakash 2006-04-05 04:13:48 UTC
Hi,
Although I cannot give you a self-contained testcase, I see exactly the
same problem on
2.4.9-e.34enterprise #1 SMP Wed Dec 10 16:42:39 i686
I see this problem in the following scenario:
  I do a longjmp from within a signal handler (SIGALRM) and it takes quite
  long for the jump to actually happen. The stack trace always shows a
  pthread_mutex_lock wait (as reported in this bug) in the libc_free routine.

thanks,
ravi

Comment 4 Ravi Prakash 2006-04-05 05:15:07 UTC
Hi,
I might have stated the problem above incorrectly, as I have seen so
many different cases in the last 2 weeks.

Is it possible that the following is happening?
  1. When I call longjmp, a mutex is left locked by libc_free (note that in
     this mode, I am not using threads in my application).
  2. The longjmp happens, and when I try to clean up after it, free finds
     the mutex locked and just waits for someone to unlock it.
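
If that is what is happening (it is not confirmed in this report), one
common way to avoid it is to not jump out of the handler at all: the
handler only sets a flag, and the main code checks the flag at points
where no allocator lock can be held. The sketch below illustrates the
pattern; on_alarm, install_alarm_handler and work_until_timeout are
hypothetical names, not part of the application discussed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler, polled by normal code. */
static volatile sig_atomic_t timed_out = 0;

/* Async-signal-safe handler: records the event and returns. */
static void on_alarm(int signo)
{
    (void)signo;
    timed_out = 1;
}

/* Install the SIGALRM handler; returns 0 on success, -1 on failure. */
static int install_alarm_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Perform up to max_steps units of work, stopping once the flag is set.
   Returns the number of units completed. */
static int work_until_timeout(int max_steps)
{
    int done = 0;
    while (done < max_steps && !timed_out) {
        /* ... one unit of work that may call malloc/free ... */
        done++;
    }
    return done;
}
```

Because the interrupted malloc or free call always runs to completion
before the flag is examined, cleanup code never starts while the
allocator lock is held by the interrupted context.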

Another problem that I saw while using threads was that after I issue
pthread_cancel, I do a pthread_join on the canceled thread (or else I run into
the libc mutex problem cited above), and then the join takes forever; in one
case it took almost 2500s of real time.
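
An alternative to pthread_cancel, sketched here under the unconfirmed
assumption that the trouble comes from cancelling a thread while it is
inside the allocator, is to ask the worker to stop and let it exit on
its own. The example uses C11 atomics, which postdate the platform in
this report, so it illustrates the pattern only; stop_requested, worker
and run_and_stop_worker are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Shared stop flag: set by the controller, polled by the worker. */
static atomic_int stop_requested;

/* Worker loop: checks the flag between allocations, never during one. */
static void *worker(void *arg)
{
    long *iterations = arg;
    while (!atomic_load(&stop_requested)) {
        void *block = malloc(64);
        free(block);
        (*iterations)++;
    }
    return NULL;
}

/* Start a worker, request a stop, and wait for it to exit.
   Returns 0 on success, -1 if the thread could not be created,
   or the pthread_join error code. */
static int run_and_stop_worker(long *iterations)
{
    pthread_t tid;
    *iterations = 0;
    atomic_store(&stop_requested, 0);
    if (pthread_create(&tid, NULL, worker, iterations) != 0)
        return -1;
    atomic_store(&stop_requested, 1);
    return pthread_join(tid, NULL);
}
```

The join returns as soon as the worker finishes its current iteration,
and the thread is never interrupted while it holds an allocator lock.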

thanks,
ravi