Bug 1230802

Summary: Backport of upstream bug #7316 for python 2.7
Product: [Fedora] Fedora
Reporter: Flavio Grossi <flaviogrossi>
Component: python
Assignee: Charalampos Stratakis <cstratak>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 24
CC: bkabrda, dmalcolm, ivazqueznet, jberan, jonathansteffan, ncoghlan, pviktori, tomspur, tradej
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2017-08-08 11:59:00 UTC
Type: Bug
Attachments:
- backport of upstream #7316 (flags: none)
- test program to expose the bug (flags: none)

Description Flavio Grossi 2015-06-11 14:40:07 UTC
Created attachment 1037733
backport of upstream #7316

Description of problem:
In Python 2, threading.Condition.wait(timeout=x) is implemented in the
threading library as a semi-busy loop
(https://hg.python.org/cpython/file/ae1bc5b65e65/Lib/threading.py#l344): the
waiting thread repeatedly sleeps and polls the lock until the timeout expires.
This causes CPU wakeups and unexpected CPU use.
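
To make the semi-busy loop concrete, here is a minimal sketch loosely modelled on the timeout branch of the linked threading.py. The function name polling_acquire and the wakeup counter are hypothetical additions for illustration, and the delay constants should be treated as illustrative rather than authoritative.

```python
import threading
import time


def polling_acquire(lock, timeout):
    """Try to take `lock` within `timeout` seconds by sleeping and polling.

    Returns (acquired, wakeups). Hypothetical helper, not stdlib code.
    """
    delay = 0.0005  # doubled before first use, so the first sleep is about 1 ms
    wakeups = 0
    endtime = time.time() + timeout
    while True:
        if lock.acquire(False):  # non-blocking attempt
            return True, wakeups
        remaining = endtime - time.time()
        if remaining <= 0:
            return False, wakeups
        # Back off exponentially, but never sleep longer than 50 ms,
        # so the thread keeps waking up for the whole timeout.
        delay = min(delay * 2, remaining, 0.05)
        time.sleep(delay)
        wakeups += 1


if __name__ == "__main__":
    held = threading.Lock()
    held.acquire()  # nobody will release it, so the call below times out
    acquired, wakeups = polling_acquire(held, 1.0)
    print("acquired=%s wakeups=%d" % (acquired, wakeups))
```

Because the sleep is capped, a thread waiting with a long timeout wakes up many times per second even when nothing happens, which is where the CPU wakeups come from.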

This bug is currently "fixed" in bug 917709
(https://bugzilla.redhat.com/show_bug.cgi?id=917709) by introducing a balancing
keyword argument to the stdlib. However, this fix has some problems:
- When balancing is used, threads waiting on a condition with a long timeout
  are notified with a big delay, because the wait() call uses sleep().
  E.g. if wait() is called with a timeout of 30s, the thread can be notified
  up to 30s after the notify(). This sounds very wrong to me.
- That fix doesn't apply to existing software or to other library functions,
  e.g. Queue.Queue().get(timeout=x) still has the problems described in
  #917709, as the attached Python program shows.
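
For readers without access to the attachment, the following is a rough sketch of how the Queue case could be exercised. It is not the attached queue_get_timeout.py, which is not reproduced in this report; the names blocked_get and run and the thread count are hypothetical. It starts a few threads that block on an empty queue with a timeout and reports the CPU time consumed by the process while they wait.

```python
import os
import threading

try:
    import Queue as queue  # Python 2
except ImportError:
    import queue  # Python 3


def blocked_get(q, timeout, outcomes):
    """Block on an empty queue until the timeout expires."""
    try:
        q.get(timeout=timeout)
        outcomes.append("item")
    except queue.Empty:
        outcomes.append("timeout")


def run(n_threads=4, timeout=2.0):
    """Return (outcomes, cpu_seconds) for n_threads idle waiters."""
    q = queue.Queue()
    outcomes = []
    before = os.times()
    threads = [
        threading.Thread(target=blocked_get, args=(q, timeout, outcomes))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    after = os.times()
    # Index 0 is user CPU time, index 1 is system CPU time.
    cpu_seconds = (after[0] - before[0]) + (after[1] - before[1])
    return outcomes, cpu_seconds


if __name__ == "__main__":
    outcomes, cpu_seconds = run()
    print("outcomes=%s cpu_seconds=%.4f" % (outcomes, cpu_seconds))
```

Comparing the reported CPU time under Python 2 and Python 3 with a larger thread count and a longer timeout should give a rough picture of the difference described above; watching the process in a tool such as top is an alternative.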

This problem has been fixed upstream in Python >= 3.2 by exposing a new API
built on the POSIX pthread_cond_wait() call.

The attached patch is a backport of the Python 3 fix to Python 2 (see
http://bugs.python.org/issue7316). Applying this patch and removing the one
described in #917709 should fix the problem (but please review my patch).
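
As an illustration of the behaviour the backport is meant to bring to Python 2, here is a small Python 3 sketch that measures how long a waiter takes to wake up after notify() when it was waiting with a long timeout. The function name measure_notify_latency and its parameters are hypothetical.

```python
import threading
import time


def measure_notify_latency(timeout=30.0, notify_after=0.2):
    """Return (notified, latency_seconds) for a waiter using a long timeout."""
    cond = threading.Condition()
    waiting = threading.Event()
    result = {}

    def waiter():
        with cond:
            # Set while holding the lock: the main thread can only take
            # the lock again once wait() has released it.
            waiting.set()
            result["notified"] = cond.wait(timeout=timeout)
            result["woke_at"] = time.monotonic()

    t = threading.Thread(target=waiter)
    t.start()
    waiting.wait()
    time.sleep(notify_after)
    with cond:
        notified_at = time.monotonic()
        cond.notify()
    t.join()
    return result["notified"], result["woke_at"] - notified_at


if __name__ == "__main__":
    notified, latency = measure_notify_latency()
    print("notified=%s latency=%.4fs" % (notified, latency))
```

On Python 3 the waiter blocks in the lock implementation instead of polling, so it should wake shortly after notify() rather than after the 30s timeout.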


Version-Release number of selected component (if applicable):
python < 3


How reproducible:
Running the attached script queue_get_timeout.py with Python 2 and observing
a high CPU load. The same test under Python 3 does not cause a high CPU load.


Steps to Reproduce:
1. run the attached script
2. observe high cpu load

Actual results:
high cpu load


Expected results:
idle cpu


Additional info:

Comment 1 Flavio Grossi 2015-06-11 14:40:49 UTC
Created attachment 1037734
test program to expose the bug

Comment 2 Nick Coghlan 2015-09-15 10:01:26 UTC
Even if this change doesn't get accepted upstream, I think it would be a good improvement to the previously applied downstream extension to the API, as it makes it consistent with Python 3 for the benefit of single source Python 2/3 code.

Comment 3 Flavio Grossi 2015-09-15 10:36:24 UTC
For those interested, the new upstream bug is
http://bugs.python.org/issue25084

and to me the only blocker, as noted there, is the need to backport the patch to other thread implementations.

Which ones are of interest for Red Hat/Fedora? The only one seems to be thread_pth.h, but as far as I can tell it is not used right now.

Comment 4 Nick Coghlan 2015-09-19 04:05:20 UTC
As far as I am aware, thread_pthread is the only threading backend used in Fedora and RHEL, so I'd be fine with a patch that threw RuntimeError if the timeout argument was set to a non-default value and a backend other than pthreads was in use.

Bringing over some of my Fedora (et al) specific comments from the upstream issue, the current situation we have in regards to the system Python is:

* Python 2 only code can pass "balanced=False" (but probably wouldn't want to due to the potential increase in wake-up latency)
* Python 3 only code can specify a timeout directly
* Single source Python 2/3 code that supports both Fedora and RHEL/CentOS 7 can't do either (since the API signatures are different)

If we update the Python 2.7 patch to be a backport of the Python 3 threading timeout API and implementation, then not only is the system Python better behaved by default, but the explicit timeout API becomes usable in single source Python 2/3 code (once the API fix makes its way into a RHEL/CentOS 7 point release).
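
To illustrate the single source problem, here is a rough sketch of the kind of compatibility shim such code needs while the signatures differ. The helper name acquire_with_timeout and the 10 ms polling interval are hypothetical. It assumes that an unpatched Python 2 lock rejects a second argument to acquire() with TypeError.

```python
import threading
import time


def acquire_with_timeout(lock, timeout):
    """Acquire `lock` within `timeout` seconds on Python 2 or Python 3."""
    try:
        # Python 3.2+: acquire(blocking, timeout) blocks without polling.
        return lock.acquire(True, timeout)
    except TypeError:
        # Assumed unpatched Python 2 behaviour: no timeout parameter,
        # so fall back to polling with short sleeps.
        endtime = time.time() + timeout
        while not lock.acquire(False):
            if time.time() >= endtime:
                return False
            time.sleep(0.01)
        return True


if __name__ == "__main__":
    lock = threading.Lock()
    print(acquire_with_timeout(lock, 1.0))  # lock is free
    print(acquire_with_timeout(lock, 0.5))  # lock is now held
```

With a backport of the Python 3 API, the except branch would no longer be needed and the same call would work on both interpreters.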

Comment 5 Flavio Grossi 2015-10-26 10:11:04 UTC
any news on this issue?

Comment 6 Fedora Admin XMLRPC Client 2016-01-29 13:05:13 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 7 Fedora End Of Life 2016-07-19 14:46:32 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 8 Fedora End Of Life 2017-07-25 18:57:34 UTC
This message is a reminder that Fedora 24 is nearing its end of life.
Approximately 2 (two) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 24. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '24'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 24 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 9 Fedora End Of Life 2017-08-08 11:59:00 UTC
Fedora 24 changed to end-of-life (EOL) status on 2017-08-08. Fedora 24 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.