Created attachment 397632 [details]
Source file that exhibits the problem.

Description of problem:
Multithreaded applications competing for a mutex often crash with:
pthread_mutex_lock.c:87: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
I've attached source code that reproduces the exact problem.

Version-Release number of selected component (if applicable):
Environment 1: Fedora Core 10
2.6.27.24-170.2.68.fc10.i686.PAE #1 SMP Wed May 20 22:58:30 EDT 2009
gcc (GCC) 4.3.2 20081105 (Red Hat 4.3.2-7)
glibc 2.9 i386

Environment 2: Fedora Core 11
2.6.29.4-167.fc11.i586 #1 SMP Wed May 27 17:14:37 EDT 2009
gcc (GCC) 4.4.1 20090725 (Red Hat 4.4.1-2)
glibc 2.10.2 i686

How reproducible:
Always. Just start some pthreads competing for a lock on a default-initialized mutex; the crash occurs inside pthread_mutex_lock.

Steps to Reproduce:
1. Compile the attachment like this:
$ g++ -g -Wall -Werror -pipe -O3 -Wno-deprecated break_pthreads.cpp -o break_pthreads.o -c
$ g++ -g -pthread break_pthreads.o -o break_pthreads
$ rm break_pthreads.o
2. Run ./break_pthreads
3. On my fc10 and fc11 systems it takes less than a second for the crash to happen.

Actual results:
break_pthreads: pthread_mutex_lock.c:62: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted (core dumped)

Expected results:
The program should run forever.

Additional info:
I've already read:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=479952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29415
Those links seem to be related.
The program has undefined behaviour. Use PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP.
Can you please explain why it has undefined behaviour? Is it OK to use PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP in production code?
The code locks the mutex once and then unlocks it twice on some code paths; you can't unlock a mutex that isn't locked. It is OK to use error-checking mutexes in production code, they are just slower than the normal ones. With an error-checking mutex the second pthread_mutex_unlock will simply fail with EPERM. It's much better to just fix the bug, though.
Thanks a lot Jakub! The production code is completely different but runs into the same issue. I think I'm going to put wrapper calls around pthread_mutex_lock and pthread_mutex_unlock so I can better trace any failing path through their return values. Regards!