Bug 1117795 - Thin provisioning disks broken on block storage when using pthreading 1.3
Summary: Thin provisioning disks broken on block storage when using pthreading 1.3
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: python-pthreading
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Yaniv Bronhaim
QA Contact: Jiri Belka
URL:
Whiteboard: infra
Depends On:
Blocks: 1118429 1119025 1119226 rhev3.5beta 1156165
 
Reported: 2014-07-09 12:07 UTC by Nir Soffer
Modified: 2016-02-10 19:45 UTC (History)
CC List: 11 users

Fixed In Version: python-pthreading-0.1.3-3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1118429 1119025 1119226
Environment:
Last Closed: 2015-02-17 17:13:29 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Description Nir Soffer 2014-07-09 12:07:59 UTC
Description of problem:

python-pthreading 0.1.3 introduced a change that is not compatible
with vdsm.

When vdsm is monitoring the mailbox and a read operation fails,
vdsm tries to log the error. However, the code first ensures
that a lock is released by calling the locked() method of the
lock object. Unfortunately, pthreading.Lock does not implement
this method, so a new exception is raised, causing
the mailbox thread to exit. From this point on, disk extend requests
sent over the mailbox are not handled, and vms are paused
when their disks become full.
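
The failure mode described above can be sketched in a few lines. This is only an illustration, not the vdsm mailbox code: LockWithoutLocked, monitor, failing_read and the "logged" marker are hypothetical names, and the sketch targets current Python rather than the Python 2 used by vdsm at the time.

```python
import threading


class LockWithoutLocked(object):
    """Hypothetical stand-in for a lock type that lacks locked()."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()


def failing_read():
    """Simulate a failed read from the mailbox."""
    raise IOError("simulated read failure")


def monitor(lock, read, handled, max_rounds=3):
    """Hypothetical monitor loop that tries to survive read errors."""
    for _ in range(max_rounds):
        try:
            lock.acquire()
            try:
                handled.append(read())
            finally:
                lock.release()
        except IOError:
            # The error handler checks the lock before logging. With a
            # lock that has no locked() method this raises
            # AttributeError, which nothing catches, so the loop ends.
            if lock.locked():
                lock.release()
            handled.append("logged")
```

Calling monitor(threading.Lock(), failing_read, []) records one "logged" entry per round, while monitor(LockWithoutLocked(), failing_read, []) raises AttributeError from inside the handler. When monitor is the target of a thread, that unhandled exception ends the thread.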

This issue has existed in pthreading since the first version: it never
implemented the locked() method, probably because the method is not well
documented. The error was hidden because the mailbox does not use
threading.Lock, which pthreading replaces with pthreading.Lock, but
thread.allocate_lock. In version 0.1.3, pthreading started to replace
thread.allocate_lock with pthreading.Lock as well, causing the
mailbox code to fail.
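
Why the missing method stayed hidden can be shown with a small sketch of the patching scope. It does not touch the real modules or use the pthreading API: fake_threading and fake_thread are stand-in namespaces, and ReplacementLock, monkey_patch and mailbox_lock_has_locked are hypothetical names.

```python
import threading
import types


class ReplacementLock(object):
    """Hypothetical replacement lock type that does not define locked()."""


# Stand-ins for the two module namespaces a caller can get locks from.
fake_threading = types.SimpleNamespace(Lock=threading.Lock)
fake_thread = types.SimpleNamespace(allocate_lock=threading.Lock)


def monkey_patch(patch_allocate_lock):
    """Replace lock factories in the stand-in namespaces.

    With patch_allocate_lock=False only Lock is replaced (the older
    behavior described above); with True, allocate_lock is replaced too.
    """
    fake_threading.Lock = ReplacementLock
    if patch_allocate_lock:
        fake_thread.allocate_lock = ReplacementLock


def mailbox_lock_has_locked():
    """A mailbox-style caller takes its lock from allocate_lock."""
    return hasattr(fake_thread.allocate_lock(), "locked")
```

After monkey_patch(False) the mailbox-style caller still receives a built-in lock, so locked() is available. After monkey_patch(True) it receives the replacement type and the call fails.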

Version-Release number of selected component (if applicable):
python-pthreading-0.1.3-1.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Wait until there is an I/O error on the master domain and the mailbox
   monitor thread exits
2. Watch vms pause

Workaround:

Downgrading python-pthreading to version python-pthreading-0.1.2-1.el6ev.noarch eliminates this issue.

Comment 1 Nir Soffer 2014-07-10 09:28:35 UTC
The vdsm patch eliminates this error in vdsm even with the broken pthreading version.

Not setting this to POST since the pthreading fix is required to support older vdsm versions anyway.
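
One way a lock type can provide locked() using only acquire and release is sketched below. This is an assumption for illustration and not necessarily how the pthreading patch implements it; LockWithLocked is a hypothetical name.

```python
import threading


class LockWithLocked(object):
    """Hypothetical wrapper lock that adds locked() on top of acquire/release."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()

    def locked(self):
        # Try to take the lock without blocking. If that succeeds, the
        # lock was free: release it again and report False.
        if self.acquire(False):
            self.release()
            return False
        return True
```

Like the built-in method, the result is only a snapshot: another thread may take or release the lock right after locked() returns.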

Comment 4 Nir Soffer 2014-07-10 14:18:03 UTC
The pthreading patch is already merged upstream, and we don't have a downstream branch for this project, so this can probably be MODIFIED now.

Comment 6 Douglas Schilling Landgraf 2014-07-10 16:52:24 UTC
(In reply to Nir Soffer from comment #4)
> The pthreading patch is already merge in upstream, and we don't have a
> downstream branch for this project, so this can probably be MODIFIED now.

We need to build downstream. However, we need the triple acks first, as Allon shared in comment #3.

Comment 7 Barak 2014-07-13 17:03:17 UTC
Yaniv,

1 - we need to change the vdsm dependency as well
2 - we need a new build for vdsm
3 - what about the 3.4.z flag? Is Nir's fix enough, or do we also need to
    backport the pthreading fix?

Comment 8 Yaniv Bronhaim 2014-07-13 17:09:00 UTC
1. We are waiting for the new pthreading to reach the stable repo; it has already been submitted.
2. Yes, we need a new build for 3.4, and a build for pthreading, which is already in testing.
3. Nir's fix is enough for this case, but it is better to have the pthreading fix available as well. We are doing both as fast as possible.

Comment 12 Nir Soffer 2014-07-18 18:47:38 UTC
How is a python-pthreading bug fixed by vdsm?

Comment 13 Jiri Belka 2014-07-23 12:41:24 UTC
Fixed in version seems wrong, shouldn't it be python-pthreading? Please correct.

Comment 14 Jiri Belka 2014-07-23 12:59:23 UTC
ok, same version as for zstream BZ1119226.

Comment 16 Eyal Edri 2015-02-17 17:13:29 UTC
RHEV 3.5.0 was released. Closing.

