Description of problem:
python-pthreading 0.1.3 introduced a change which is not compatible with vdsm. When vdsm is monitoring the mailbox and a read operation fails, vdsm tries to log the error. However, the code first ensures that a lock is released by calling the locked() method of the lock object. Unfortunately, pthreading.Lock does not implement this method, so a new exception is raised, causing the mailbox thread to exit. From this point on, disk extend requests sent over the mailbox are not handled, leading to VMs pausing when their disks become full.

This issue has existed in pthreading since the first version: it never implemented the locked() method, probably because the method is not well documented. The error was hidden because the mailbox code does not use threading.Lock (which is replaced by pthreading.Lock when pthreading is in use) but thread.allocate_lock. In version 0.1.3, pthreading started to replace thread.allocate_lock with pthreading.Lock as well, causing the mailbox code to fail.

Version-Release number of selected component (if applicable):
python-pthreading-0.1.3-1.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Wait until there is an I/O error on the master domain and the mailbox monitor thread exits
2. Watch VMs pause

Workaround:
Downgrading python-pthreading to version python-pthreading-0.1.2-1.el6ev.noarch eliminates this issue.
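The failure mode described above can be illustrated with a small sketch. This is not the vdsm or pthreading code: LockWithoutLocked, handle_read_error and monitor_once are hypothetical names invented for illustration, and the sketch catches the AttributeError only to report it, whereas in the reported bug the exception propagates and ends the mailbox thread.

```python
import threading


class LockWithoutLocked(object):
    """Hypothetical stand-in for a lock type that lacks locked()."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()


def handle_read_error(lock, log):
    """Hypothetical error path: release the lock if held, then log."""
    if lock.locked():  # raises AttributeError when locked() is missing
        lock.release()
    log.append("read failed")


def monitor_once(lock):
    """Simulate one monitoring cycle in which the read fails."""
    log = []
    lock.acquire()
    try:
        raise IOError("simulated read failure")
    except IOError:
        try:
            handle_read_error(lock, log)
        except AttributeError as e:
            # In the reported bug nothing catches this, so the thread exits.
            return "thread would exit: %s" % e
    return "logged: %s" % log[0]
```

With a standard threading.Lock the error is logged; with the lock type that has no locked() method, the error handler itself fails before anything is logged.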
The vdsm patch eliminates this error in vdsm even with the broken pthreading version. Not setting this to POST, since the pthreading fix is required to support older vdsm versions anyway.
The pthreading patch is already merged upstream, and we don't have a downstream branch for this project, so this can probably be MODIFIED now.
(In reply to Nir Soffer from comment #4)
> The pthreading patch is already merged upstream, and we don't have a
> downstream branch for this project, so this can probably be MODIFIED now.

We need to build downstream. However, we need the triple acks first, as Allon shared in comment #3.
Yaniv,
1 - We need to change the vdsm dependency as well.
2 - We need a new build for vdsm.
3 - What about the 3.4.z flag? Is Nir's fix enough, or do we also need to backport the pthreading fix?
1. We are waiting for the new pthreading to reach the stable repo; it has already been submitted.
2. Yes, we need a new build for 3.4, and a build of the pthreading version that is already in testing.
3. Nir's fix is enough for this case, but it is better to have the pthreading fix available as well. We are doing both as fast as possible.
Fixed in: https://github.com/oVirt/pthreading/commit/b42f0acba4ad5a8fb971733fedd295e7d075afbc
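For readers who want to see what a locked() method on such a lock type could look like, here is a minimal sketch. It is an assumption for illustration and not necessarily what the commit above does; SketchLock is a hypothetical class that wraps threading.Lock rather than the pthread mutex used by pthreading.

```python
import threading


class SketchLock(object):
    """Hypothetical lock wrapper that adds locked() on top of acquire/release."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()

    def locked(self):
        # Try a non-blocking acquire: if it succeeds, the lock was free,
        # so release it again and report False; otherwise it is held.
        if self.acquire(False):
            self.release()
            return False
        return True
```

As with any locked() check, the result is only a snapshot: another thread may acquire or release the lock right after the call returns.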
How is a python-pthreading bug fixed by vdsm?
Fixed in version seems wrong, shouldn't it be python-pthreading? Please correct.
ok, same version as for zstream BZ1119226.
rhev 3.5.0 was released. closing.