From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030313 Description of problem: The thread code used on Linux (Python/thread_pthread.h) uses unnecessarily many locking operations in the acquire function. I'll attach a patch to change that. Version-Release number of selected component (if applicable): python-2.2.2-26 How reproducible: Always Steps to Reproduce: 1.Look at Python/thread_pthread.h function PyThread_acquire_lock 2. 3. Additional info:
Created attachment 90651 [details] Patch to remove unnecessary locking The effect is only really visible if there is contention.
Uli, what are the chances that this patch would work for any libc? Should I try to push it upstream? (__builtin_expect looks very glibc-specific, but I may be wrong too since I don't know much about threads).
If you leave out __builtin_expect (this is a gcc feature) it'll work everywhere.
After reading the code it looks to me like the patch will produce deadlocks. Let me explain my point, and please feel free to tell me I'm on crack. Python's serialization does not rely exclusively on a mutex. The reason is stated in a big comment block in thread_pthread.h: /* A pthread mutex isn't sufficient to model the Python lock type * because, according to Draft 5 of the docs (P1003.4a/D5), both of the * following are undefined: * -> a thread tries to lock a mutex it already has locked * -> a thread tries to unlock a mutex locked by a different thread * pthread mutexes are designed for serializing threads over short pieces * of code anyway, so wouldn't be an appropriate implementation of * Python's locks regardless. * * The pthread_lock struct implements a Python lock as a "locked?" bit * and a <condition, mutex> pair. In general, if the bit can be acquired * instantly, it is, else the pair is used to block the thread until the * bit is cleared. 9 May 1994 tim */ Wheater or not those assumptions are true or not, I don't know. Thing is, the locking mechanism is provided by a global flag protected by a mutex. Changing the flag's values requires acuiring the mutex and then releasing it. Now, if I read the patch correctly, the following scenario would be possible: - thread A acquires the global lock (thelock->locked == 1); the mutex is released. - thread B wants to acquire the lock, so it calls PyThread_acquire_lock - thread B acquires the mutex; because thelock->locked ==1, success == 0 - thread B enters the if part and then the while loop, and locks on pthread_cond_wait (while the mutex is still acquired) - thread A is done and calls PyThread_release_lock; it locks waiting on the mutex hold by B. A would call thread_cond_signal to unlock B, but it won't get there. Is this scenario possible?
> After reading the code it looks to me like the patch will produce deadlocks. Let > me explain my point, and please feel free to tell me I'm on crack. Your analysis is indeed wrong. > Python's serialization does not rely exclusively on a mutex. The reason is > stated in a big comment block in thread_pthread.h: And this hasn't changed. If you look at the possible code paths you'll see that only the case when the 'if' condition is true actually changed. If the '#if' expression is false nothing changed with the tiny exception that the 'if' expression is evaluated while holding the lock. The expression consists simply of tests of variables, and therefore has no possibility of a lockup. Now look at the case when the 'if' expression is true. In this case all we do is replace unlock if evaluation lock with if evaluation Again, the expression consists simply of variable comparisons, nno possible delay or lockup. The previously performed unlock/lock are pointless and just add more, unnecessary work.
Yes, you are right. Makes me wonder how come the deadlock is not possible in the current code. Let me think more about it, I may be missing something. python 2.3a seems to have a version that uses semaphores instead of mutexes, in addition to the current version (which has not changed).
OK, I missed this. pthread_cond_wait atomically unlocks the mutex (as per pthread_unlock_mutex) and waits for the condition variable cond to be signaled. The thread execution is suspended and does not consume any CPU time until the condition variable is signaled. The mutex must be locked by the calling thread on entrance to pthread_cond_wait. Before returning to the calling thread, pthread_cond_wait re-acquires mutex (as per pthread_lock_mutex). Now the code makes sense, and so does the patch. Thanks.
Patch sent upstream: https://sourceforge.net/tracker/?func=detail&aid=711835&group_id=5470&atid=305470
Fixed upstream.