Bug 1862364 - Race condition in oslo_privsep daemon
Summary: Race condition in oslo_privsep daemon
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-oslo-privsep
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z3
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: mbollo
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks: 1783195
Reported: 2020-07-31 08:42 UTC by Cédric Jeanneret
Modified: 2020-12-15 18:36 UTC (History)

Fixed In Version: python-oslo-privsep-1.33.4-1.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-15 18:36:15 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1887506 | 0 | None | None | None | 2020-08-25 11:41:48 UTC
OpenStack gerrit | 740970 | 0 | None | MERGED | Undo the eventlet monkey patch for the privileged daemon | 2021-01-11 10:22:19 UTC
OpenStack gerrit | 747906 | 0 | None | MERGED | Undo the eventlet monkey patch for the privileged daemon | 2021-01-11 10:22:56 UTC
OpenStack gerrit | 747907 | 0 | None | MERGED | Undo the eventlet monkey patch for the privileged daemon | 2021-01-11 10:22:19 UTC
Red Hat Product Errata | RHEA-2020:5413 | 0 | None | None | None | 2020-12-15 18:36:52 UTC

Description Cédric Jeanneret 2020-07-31 08:42:37 UTC
Description of problem:
A race condition can be triggered on the socket used by daemon.py: because the socket is provided by an external process, that process may drop the socket before the privsep actions have finished.

This leads to several issues, among them:
- a stack trace with "BrokenPipeError: [Errno 32] Broken pipe"
- the service hanging while trying to read the socket using threading; it hangs there forever


Version-Release number of selected component (if applicable):
python3-oslo-privsep-1.33.3-0.20200310175027.ddde706.el8ost.noarch
python-oslo-privsep-lang-1.33.3-0.20200310175027.ddde706.el8ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a socket for oslo_privsep
2. Start running privsep
3. Drop the socket in the middle of the action
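
The BrokenPipeError side of these steps can be illustrated outside of oslo_privsep. The following is only a minimal sketch using the standard library: the socket pair stands in for the channel between the external process and the daemon, and the names (send_after_peer_closed, daemon_end, client_end) are hypothetical, not taken from daemon.py.

```python
import socket


def send_after_peer_closed():
    """Send on a socket whose peer has already been closed.

    Returns the OSError raised by the send, or None if the send succeeded.
    """
    # Stand-in for the socket shared by the external process and the daemon.
    daemon_end, client_end = socket.socketpair()
    try:
        client_end.close()  # the external process drops its end early
        try:
            daemon_end.sendall(b"reply")  # the daemon still tries to answer
        except OSError as exc:  # BrokenPipeError is a subclass of OSError
            return exc
        return None
    finally:
        daemon_end.close()


if __name__ == "__main__":
    print(repr(send_after_peer_closed()))
```

On Linux the send is expected to fail with errno 32 (EPIPE), which Python reports as BrokenPipeError.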

Actual results:
We end up with either a BrokenPipeError or a hanging process


Expected results:
It should fail more gracefully, and really NOT hang forever
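
As an illustration of "failing instead of hanging", a blocking read can be given a deadline. This is a generic standard-library sketch, not the change that was merged for this bug; read_with_deadline and its parameters are hypothetical names.

```python
import socket


def read_with_deadline(sock, nbytes, timeout_s):
    """Read up to nbytes from sock, giving up after timeout_s seconds.

    Returns the bytes read (b"" means the peer closed the socket),
    or None if nothing arrived before the deadline.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    # Nobody writes yet: the read gives up instead of blocking forever.
    print(read_with_deadline(reader, 1, 0.1))
    writer.sendall(b"x")
    print(read_with_deadline(reader, 1, 0.1))
    reader.close()
    writer.close()
```

The caller can then treat None as an error and report it, rather than waiting on the socket indefinitely.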

Additional info:
This was detected while working on the following BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1783195

Comment 1 Hervé Beraud 2020-08-25 11:41:48 UTC
Hello,

This looks like an eventlet issue.

Indeed, the daemon is an independent process that executes the methods passed to it in privileged mode. Some of those methods use libraries such as "os" or "threading". Sometimes those methods are not eventlet-safe: the GIL is given to the next user thread and is never returned, which ends in a command timeout.

Unless there is a major reason to monkey patch those libraries, I suggest undoing the monkey patching of those libraries [1][2][3] when the privileged daemon is forked. That will help prevent this kind of hanging process and these command timeouts.

[1] https://review.opendev.org/#/c/740970/
[2] https://review.opendev.org/#/c/747906/
[3] https://review.opendev.org/#/c/747907/
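
The idea of undoing the monkey patch can be sketched roughly as follows. This is an illustration only and not the code from the reviews above: undo_monkey_patch and CANDIDATE_MODULES are hypothetical names, the module list is an arbitrary subset, and the sketch assumes that each eventlet.green module lists its replaced attributes in __patched__. It does nothing when eventlet is not installed or nothing has been patched.

```python
import importlib
import sys

# Illustrative subset of module names that eventlet can monkey patch and
# that have a matching eventlet.green.<name> module (an assumption).
CANDIDATE_MODULES = ("os", "select", "socket", "time")


def undo_monkey_patch(names=CANDIDATE_MODULES):
    """Put the original attributes back on monkey-patched modules.

    Returns the list of module names that were restored.
    """
    try:
        from eventlet import patcher
    except ImportError:
        return []  # eventlet is not installed, so nothing is patched

    restored = []
    for name in names:
        if not patcher.is_monkey_patched(name):
            continue
        live_mod = sys.modules.get(name)
        if live_mod is None:
            continue
        green_mod = importlib.import_module("eventlet.green." + name)
        # patcher.original() returns an unpatched copy of the module.
        original_mod = patcher.original(name)
        for attr in getattr(green_mod, "__patched__", []):
            if hasattr(original_mod, attr):
                setattr(live_mod, attr, getattr(original_mod, attr))
        restored.append(name)
    return restored


if __name__ == "__main__":
    print(undo_monkey_patch())
```

In the scenario described here, such a function would be called in the privileged daemon right after it is forked, so that the methods it runs see the unpatched standard library.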

Comment 17 errata-xmlrpc 2020-12-15 18:36:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5413

