Description of problem: Running "pulp/playpen/perftests/loop_sync_cancel.sh" and noticing on some runs pulp appears hung. Below are the stack traces from 'debugging'. Issue is that python logging uses a lock. One thread is logging a statement, while it is performing a release of the lock we interrupt with a CancelException. Next thread tries to log something and it's hung on acquiring the lock. The timing of this is difficult, I don't see this scenario a lot when running loop_cancel_sync, but I have seen it a few times over several days of testing. Thread was in the process of releasing lock and was interrupted with a CancelException [Fri May 06 08:36:34 2011] [error] Exception in thread Thread-486: [Fri May 06 08:36:34 2011] [error] Traceback (most recent call last): [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/threading.py", line 530, in __bootstrap_inner [Fri May 06 08:36:34 2011] [error] self.run() [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/threading.py", line 483, in run [Fri May 06 08:36:34 2011] [error] self.__target(*self.__args, **self.__kwargs) [Fri May 06 08:36:34 2011] [error] File "/shared/repo/pulp/src/pulp/server/tasking/task.py", line 343, in run [Fri May 06 08:36:34 2011] [error] _log.error("Task id:%s, method_name:%s CancelException" % (self.id, self.method_name)) [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 1154, in error [Fri May 06 08:36:34 2011] [error] self._log(ERROR, msg, args, **kwargs) [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 1246, in _log [Fri May 06 08:36:34 2011] [error] self.handle(record) [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 1256, in handle [Fri May 06 08:36:34 2011] [error] self.callHandlers(record) [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 1293, in callHandlers [Fri May 06 08:36:34 2011] [error] hdlr.handle(record) [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 742, in handle [Fri May 06 08:36:34 2011] [error] self.release() [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/logging/__init__.py", line 696, in release [Fri May 06 08:36:34 2011] [error] self.lock.release() [Fri May 06 08:36:34 2011] [error] File "/usr/lib64/python2.7/threading.py", line 138, in release [Fri May 06 08:36:34 2011] [error] self.__count = count = self.__count - 1 [Fri May 06 08:36:34 2011] [error] CancelException Thread blocked trying to acquire lock [Fri May 06 08:42:58 2011] [error] # ProcessId: 11598 [Fri May 06 08:42:58 2011] [error] # ThreadID: 140472190457600 [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/threading.py", line 503, in __bootstrap [Fri May 06 08:42:58 2011] [error] self.__bootstrap_inner() [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/threading.py", line 530, in __bootstrap_inner [Fri May 06 08:42:58 2011] [error] self.run() [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/threading.py", line 483, in run [Fri May 06 08:42:58 2011] [error] self.__target(*self.__args, **self.__kwargs) [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/queue.py", line 116, in _dispatch [Fri May 06 08:42:58 2011] [error] self._cancel_tasks() [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/queue.py", line 158, in _cancel_tasks [Fri May 06 08:42:58 2011] [error] task.cancel() [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/repo_sync_task.py", line 48, in cancel [Fri May 06 08:42:58 2011] [error] super(RepoSyncTask, self).cancel() [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/task.py", line 417, in cancel [Fri May 06 08:42:58 2011] [error] self.thread.cancel() [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/thread.py", line 243, in cancel [Fri May 06 08:42:58 2011] [error] self.raise_exception(CancelException) [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/thread.py", line 231, in raise_exception [Fri May 06 08:42:58 2011] [error] deliver_exception(self, test_exception_event, self.__exception_event.wait) [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/thread.py", line 215, in deliver_exception [Fri May 06 08:42:58 2011] [error] _raise_exception_in_thread(_tid(thread), exc_type) [Fri May 06 08:42:58 2011] [error] File: "/shared/repo/pulp/src/pulp/server/tasking/taskqueue/thread.py", line 165, in _raise_exception_in_thread [Fri May 06 08:42:58 2011] [error] _log.error("in _raise_exception_in_thread. long_tid = [%s], num = [%s]" % (long_tid, num)) [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 1154, in error [Fri May 06 08:42:58 2011] [error] self._log(ERROR, msg, args, **kwargs) [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 1246, in _log [Fri May 06 08:42:58 2011] [error] self.handle(record) [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 1256, in handle [Fri May 06 08:42:58 2011] [error] self.callHandlers(record) [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 1293, in callHandlers [Fri May 06 08:42:58 2011] [error] hdlr.handle(record) [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 738, in handle [Fri May 06 08:42:58 2011] [error] self.acquire() [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/logging/__init__.py", line 689, in acquire [Fri May 06 08:42:58 2011] [error] self.lock.acquire() [Fri May 06 08:42:58 2011] [error] File: "/usr/lib64/python2.7/threading.py", line 122, in acquire [Fri May 06 08:42:58 2011] [error] rc = self.__block.acquire(blocking)
Closing as this has been rewritten in v2. If it becomes an issue in the new code base we can refile it.